BGP Fundamental
rev0207
Agenda
• BGP General Operation
• Overview
• eBGP
• iBGP
• Attributes and Best Path Selection Algorithm
• Route Origination
• AS-PATH
• NEXTHOP
• Communities
• Controlling Traffic
• Controlling Outbound Traffic
• BGP Multipath
• Controlling Inbound Traffic
• Route Reflectors
• Multiprotocol BGP
• Convergence
• Initial Convergence
• BGP Routing Convergence
• Show and Tell/Demo Lab
BGP General Operation
TECRST-1310 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 5
Overview
[Diagram: R2, Router_20 and a third router (all ASR1K) run BGP; the example prefix is 40.40.40.0/24]
BGP Table
40.40.40.0/24
Path #1: via Router_20
Path #2: via Router_3
Overview
[Diagram: R2, Router_20 and a third router (all ASR1K) run BGP; the addressed network is 2001:db8:100:100::40/64]
BGP Table
2001:db8:100:100::/64
Path #1: via Router_20
Path #2: via Router_3
BGP General Operation
Peering
• BGP peers with other BGP speakers
• Peer is also called “neighbor”
• Uses TCP port 179
• BGP peers exchange routes
• Picks the best path
• Installs in the routing/forwarding table
• Advertises to BGP peers via UPDATEs
• UPDATEs have Attributes
• Routing policies tweak attributes to influence best path selection
[Diagram: peering between R2, R_20 and R3]
Incremental Updates
• Once BGP sends a route to a peer, it assumes the peer will keep it
• There is no periodic refresh
• New UPDATEs are sent when
• The bestpath changes
• A peer bounces
• A Route-Refresh is received
Autonomous System
• A network sharing the same routing policy
• Possibly multiple IGPs
• Usually under single administrative control
• AS Numbers
• Historically 2 bytes
• 1 to 65535
• 64512 to 65535 are private
• Running out of AS numbers…
[Diagram: AS 10 (R1, R2, R3), AS 20 (R_20), AS 30 (R_30), AS 40 (Internet)]
IGP vs. EGP
• IGP – Interior Gateway Protocol
• Exchanges routes within an Autonomous System
• Limited Scalability
• Sub-second convergence
• EIGRP, ISIS, OSPF etc.
eBGP Peering
eBGP - External BGP
• Neighbor in different AS
• Usually directly connected
• Next Hop set to self
External (eBGP)
• AS #s (Autonomous System Numbers): ≠
• TTL (Time to Live): 1 (default)
• Next Hop: Change
• Directly Connected Check: Enabled (default)
[Diagram: R_20 (AS 20), R_30 (AS 30), Internet (AS 40)]
eBGP - External BGP
Configuration
R_20
router bgp 20
bgp router-id 20.100.100.20
neighbor 5.20.40.40 remote-as 40
neighbor 5.20.40.40 send-community
Internet
router bgp 40
router-id 40.100.100.40
neighbor 5.20.40.20 remote-as 20
address-family ipv4 unicast
send-community
Router_20#sh ip bgp summary
BGP router identifier 20.100.100.20, local AS number 20
<snip>
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
5.20.40.40      4    40    8256    9102        4    0    0 5d17h           3
[Diagram: R_20 (AS 20), R_30 (AS 30), Internet (AS 40)]
eBGP Multihop
• Peer between loopbacks
• Often used to load-balance traffic over multiple links
R_2
router bgp 10
neighbor 10.1.20.1 remote-as 20
neighbor 10.1.20.1 update-source loop0
neighbor 10.1.20.1 ebgp-multihop 2
ip route 10.1.20.1 255.255.255.255 s0/0
ip route 10.1.20.1 255.255.255.255 s1/0
[Diagram: R2 in AS 10 peering with R20 in AS 20]
eBGP Multihop
• Peer between loopbacks
• Often used to load-balance traffic over multiple links
R_2
router bgp 10
neighbor 10.1.20.1 remote-as 20
neighbor 10.1.20.1 update-source loop0
neighbor 10.1.20.1 disable-connected-check
ip route 10.1.20.1 255.255.255.255 s0/0
ip route 10.1.20.1 255.255.255.255 s1/0
[Diagram: R2 in AS 10 peering with R20 in AS 20]
iBGP Peering
iBGP - Internal BGP
• Neighbor in same AS
• NEXTHOP is unchanged
• Peer to loopbacks
Internal (iBGP)
• AS #s (Autonomous System Numbers): =
• TTL (Time to Live): 255
• Next Hop: unchanged
• Directly Connected Check: disabled
[Diagram: R2 and R3 in AS 10]
iBGP - Internal BGP
• Cannot advertise route received from one iBGP peer to another iBGP peer
• Full iBGP mesh is required
• n*(n-1)/2 peering mesh – scaling problem!
• Route-Reflectors relax this constraint
[Diagram: R1, R2 and R3 in AS 10]
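The n*(n-1)/2 growth of a full iBGP mesh is easy to underestimate. A minimal sketch of the arithmetic follows; the function name is hypothetical and chosen only for illustration.

```python
def ibgp_full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions needed to fully mesh n routers: n*(n-1)/2."""
    if n < 0:
        raise ValueError("router count must be non-negative")
    return n * (n - 1) // 2


# Session count grows quadratically with the number of routers.
for routers in (3, 10, 100):
    print(routers, "routers ->", ibgp_full_mesh_sessions(routers), "sessions")
```

With 3 routers the mesh needs 3 sessions; with 100 routers it needs 4950, which is the scaling problem that route reflectors relax.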
iBGP – Loopback Peering
• Best Practice
• Loopbacks should be /32s
• Have an IGP route to loopbacks
• Configuration
R_2
router bgp 10
bgp router-id 10.100.100.2
neighbor 10.100.100.3 remote-as 10
neighbor 10.100.100.3 update-source Loopback0
neighbor 10.100.100.3 next-hop-self
R3
router bgp 10
bgp router-id 10.100.100.3
neighbor 10.100.100.2 remote-as 10
neighbor 10.100.100.2 update-source Loopback0
neighbor 10.100.100.2 next-hop-self
[Diagram: R2 and R3 in AS 10]
iBGP – Loopback Peering
• Loopback peering promotes stability
• There are two paths between R1 and R2
• If the link between them fails
• Peering with interface IP would bring down the BGP session
• Peering to a loopback allows the session to stay up
[Diagram: R1, R2 and R3 in AS 10]
Attributes and Best Path Selection Algorithm
Attributes
• IGP
• Primary attribute is a cost/metric
• The path with the lowest metric is the best…nice and easy
• BGP
• Routing Policy between ASes is usually more complex
• Shortest path is not necessarily the best one
• Has many attributes to describe reachability to a destination
• The “Best Path Algorithm” compares attributes between different paths to select the best
• Route-policies are used to tweak attributes to influence the outcome of Best Path, and therefore routing
BGP Route vs. BGP Path
• BGP can have multiple paths per route
• Here we have 2 paths to the 40.40.40.0/24 prefix
R2#show ip bgp 40.40.40.0
BGP routing table entry for 40.40.40.0/24
Paths: (2 avail, best #2, table default)
30 40
10.100.100.3 (metric 2) from 10.100.100.3
Origin IGP, metric 0, localpref 100, valid, internal
20 40
20.2.20.20 from 20.2.20.20 (20.100.100.20)
Origin IGP, localpref 100, valid, external, best
Community: 40:1
R2#
BGP Route vs. BGP Path
• “show ip bgp summary” provides the total number of routes and paths
• Paths and routes both consume memory
• The more paths you have per route, the more memory consumed
R3#show ip bgp summary
BGP router identifier 10.100.100.3, local AS number 10
BGP table version is 4, main routing table version 4
3 network entries using 432 bytes of memory
6 path entries using 480 bytes of memory
5/3 BGP path/bestpath attribute entries using 800 bytes of memory
2 BGP AS-PATH entries using 48 bytes of memory
1 BGP community entries using 24 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 1784 total bytes of memory
BGP activity 9/6 prefixes, 28/22 paths, scan interval 60 secs
BGP Path Selection Algorithm
#  Attribute            Logic
1  Weight               Higher is better. Local to the router…not really an attribute.
2  Local Preference     Local to an AS…higher is better
3  Locally Originated   Corner case…“network 10.0.0.0” vs. “aggregate 10.0.0.0” vs. “redistribute” on the same router
4  AS-PATH              Shorter AS-PATH is better
5  ORIGIN               IGP < EGP < Incomplete
BGP Path Selection Algorithm (contd)
#  Attribute            Logic
9  Lowest Router ID     Lower is better
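The ordered comparison can be sketched as a sort key, where each step is consulted only when all earlier steps tie. This is an illustrative model, not router code. The `Path` class and its field names are hypothetical. Rows 6 to 8 are not shown in the table; the sketch assumes the commonly documented Cisco order for them (lowest MED, eBGP over iBGP, lowest IGP cost to the next hop), and it ignores the rule that MED is normally compared only between paths from the same neighbor AS.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Iterable, Tuple

# ORIGIN preference from the table: IGP < EGP < Incomplete (lower rank wins).
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Path:
    """Hypothetical container for the attributes the algorithm compares."""
    as_path: Tuple[int, ...]
    router_id: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    origin: str = "igp"
    med: int = 0
    external: bool = False   # True for a path learned over eBGP
    igp_cost: int = 0        # IGP metric to the next hop


def preference_key(p: Path):
    """Smaller tuple means more preferred; later fields only break ties."""
    return (
        -p.weight,                         # 1. higher weight
        -p.local_pref,                     # 2. higher local preference
        0 if p.locally_originated else 1,  # 3. locally originated
        len(p.as_path),                    # 4. shorter AS-PATH
        ORIGIN_RANK[p.origin],             # 5. IGP < EGP < Incomplete
        p.med,                             # 6. lower MED (assumed step)
        0 if p.external else 1,            # 7. eBGP over iBGP (assumed step)
        p.igp_cost,                        # 8. lower IGP cost (assumed step)
        int(IPv4Address(p.router_id)),     # 9. lowest router ID
    )


def best_path(paths: Iterable[Path]) -> Path:
    return min(paths, key=preference_key)


# The two paths for 40.40.40.0/24 seen on R2 in the earlier show output.
via_r3 = Path(as_path=(30, 40), router_id="10.100.100.3", igp_cost=2)
via_r20 = Path(as_path=(20, 40), router_id="20.100.100.20", external=True)
print(best_path([via_r3, via_r20]).as_path)
```

Under these assumptions the external path through AS 20 wins on the eBGP-over-iBGP step, which agrees with the "best #2" marker in the R2 output.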
BGP Path Selection Algorithm
• Hard to remember all of that?
BGP Attribute   “Denise”ism
Weight          Wise
AS-PATH         Apply
ORIGIN          Oral
MED             Medication
Overview
[Diagram: R2 and Router_20 (ASR1K, AS 20) with an eBGP session to Internet (AS 40); prefix 40.40.40.0/24]
I am AS 40 and I own 40.40.40.0/24
Internet
router bgp 40
router-id 40.100.100.40
address-family ipv4 unicast
network 40.40.40.0/24
Route Origination – Network Statements
• Easiest/Cleanest method
• Network 40.40.40.0 mask 255.255.255.0
• Requires 40.40.40.0/24 to be in the RIB
• Floating static route to Null0 is common
• Originates 40.40.40.0/24
• Easy to determine/control what you are originating
router bgp 40
router-id 40.100.100.40
address-family ipv4 unicast
network 40.40.40.0/24
!
ip route 40.40.40.0 255.255.255.0 Null0 250
Route Origination – Network Statements
London-Internet# show ip bgp 40.40.40.0
BGP routing table information for VRF default, address family IPv4 Unicast
BGP routing table entry for 40.40.40.0/24, version 27
Paths: (1 available, best #1)
Flags: (0x080002) on xmit-list, is not in urib
Advertised path-id 1
Path type: local, path is valid, is best path
AS-Path: NONE, path locally originated
0.0.0.0 (metric 0) from 0.0.0.0 (40.100.100.40)
Origin IGP, MED not set, localpref 100, weight 32768
Path-id 1 advertised to peers:
5.20.40.20 5.30.40.30
London-Internet#
• The Origin is IGP
• Weight is 32768
• “0.0.0.0 from 0.0.0.0”
Route Origination – Redistribution
• Routes can be redistributed into BGP
• Pros
• Easy to configure and setup
• Cons
• IGP instability is passed along to BGP
• Isn’t always obvious what routes you are originating
• “Redistribute static” is especially dangerous
• What if someone configures a static route for Google’s address space?
• You could blackhole Google’s traffic
Route Origination – Redistribution
• Things to note
• The nexthop for the OSPF route is 10.1.1.14
• The OSPF metric is 11
Route Origination – Redistribution
• NEXTHOP uses the IGP nexthop of 10.1.1.14
• ORIGIN is set to “Incomplete”
• “metric” here means MED
• Uses the IGP metric of 11
• Weight is 32768
R10#show ip bgp 10.1.1.3
BGP routing table entry for 10.1.1.3/32, version 5
Paths: (1 available, best #1, table default)
Advertised to update-groups:
9
Local
10.1.1.14 from 0.0.0.0 (10.1.1.2)
Origin incomplete, metric 11, localpref 100, weight 32768, valid, sourced, best
Route Origination – Aggregation
• Typically used by ISPs to summarize their address space
• Reduces number of routes in global BGP table
• Adds AGGREGATOR attribute
• Contains Router-ID and AS of the router that did the aggregation
• Used for troubleshooting
Route Origination – Aggregation
• Configure an “aggregate-address” statement
• BGP table must have component route(s)
Check for component route(s)
R11#show ip bgp 10.1.0.0 255.255.0.0 longer
10.1.1.0/24, 10.1.2.0/24, etc listed here
R11#
[Diagram: R11 in AS 100, R12 in AS 200]
AS-PATH
AS-PATH
• The AS-PATH tells the story of which AS a route has traversed
• A BGP speaker prepends its own AS# to the AS-PATH when advertising to an eBGP peer
• AS-Path is used for loop detection on the border of the AS
• BGP drops an external update if it sees its own AS in the path
• When viewing the AS-PATH, the most recent AS is on the left, the originating AS is on the far right
• Shortest AS-PATH is often the tie-breaker for best path selection
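The prepend-on-advertise and drop-on-loop behavior can be sketched in a few lines. This is an illustrative model only; the function names are hypothetical, and the AS numbers follow the AS 10/20/30/40 topology used in this deck.

```python
from typing import Tuple

AsPath = Tuple[int, ...]


def advertise_to_ebgp(as_path: AsPath, local_as: int) -> AsPath:
    """A speaker prepends its own AS (leftmost) when sending to an eBGP peer."""
    return (local_as,) + tuple(as_path)


def accept_from_ebgp(as_path: AsPath, local_as: int) -> bool:
    """An external update is dropped if the receiver's own AS is in the path."""
    return local_as not in as_path


# AS 40 originates the prefix; it travels 40 -> 20 -> 10.
path_at_as20 = advertise_to_ebgp((), 40)
path_at_as10 = advertise_to_ebgp(path_at_as20, 20)
print(path_at_as10)                        # most recent AS on the left
print(accept_from_ebgp(path_at_as10, 10))  # AS 10 is not in the path yet

# If the route later came back toward AS 40, AS 40 would find itself in the path.
looped = advertise_to_ebgp(advertise_to_ebgp(path_at_as10, 10), 30)
print(accept_from_ebgp(looped, 40))
```

The originating AS stays on the far right of the tuple, matching how the AS-PATH reads in show output.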
AS-PATH
[Diagram: AS 10 (R2, R3), AS 20 (R20), AS 30 (R30), AS 40 (Internet)]
AS-PATH
R2#show ip bgp 40.1.1.0
BGP routing table entry for 40.1.1.0/24, version 6
Paths: (2 available, best #2, table default)
Advertised to update-groups:
14
Refresh Epoch 1
30 40
10.100.100.3 (metric 2) from 10.100.100.3 (10.100.100.3)
Origin IGP, metric 0, localpref 100, valid, internal
rx pathid: 0, tx pathid: 0
Refresh Epoch 1
20 40
20.2.20.20 from 20.2.20.20 (20.100.100.20)
Origin IGP, localpref 100, valid, external, best
rx pathid: 0, tx pathid: 0x0
R2#
NEXTHOP
NEXTHOP
• NEXTHOP is the address that we must route towards in order to reach the BGP prefix
• Paths where the next-hop is unreachable are not considered for best-path calculation
• eBGP does “next-hop-self” automatically
• Multiple eBGP peers on the same subnet is an exception
iBGP without next-hop-self
NEXTHOP does not change
AS 10’s IGP must have route to 30.1.1.9
Adds many /30s to IGP
[Diagram: R1, R2 and R3 in AS 10; R5 in AS 30; E0/0 30.1.1.9 on the R3–R5 link]
iBGP with next-hop-self
R3 changes NEXTHOP to its “update-source” interface
iBGP should always use loopback peering
AS 10’s IGP has a route to R3’s loopback 10.1.1.3
[Diagram: R1, R2 and R3 (Loop0 10.1.1.3) in AS 10; R5 in AS 30; E0/0 30.1.1.9 on the R3–R5 link]
Communities
Communities
• A COMMUNITY is an attribute that stores a number
• 4-byte number that is usually displayed in X:Y notation
• “ip bgp-community new-format” triggers X:Y notation
• A community by itself does nothing
• Tagging a prefix with 100:1 or 100:2 will not change routing in any way
• Set communities via a route-map
• Communities are not advertised by default
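The X:Y notation is just the 4-byte community value split into its high and low 16 bits. A minimal sketch of the conversion in both directions follows; the function names are hypothetical.

```python
def community_to_new_format(value: int) -> str:
    """Render a 32-bit community value in X:Y notation (high 16 bits : low 16 bits)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("community must fit in 4 bytes")
    return f"{value >> 16}:{value & 0xFFFF}"


def community_from_new_format(text: str) -> int:
    """Parse X:Y notation back into the 32-bit number carried in the attribute."""
    high, low = (int(part) for part in text.split(":"))
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError("each half must fit in 2 bytes")
    return (high << 16) | low


print(community_from_new_format("100:1"))
print(community_to_new_format(6553601))
```

By convention X is usually an AS number and Y a locally defined tag, but the protocol itself only carries the 32-bit number.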
Sending Communities
R2#
router bgp 20
neighbor 10.1.1.2 remote-as 10
neighbor 10.1.1.2 send-community
neighbor 10.1.1.2 route-map TAG_MY_ROUTES out
!
ip bgp-community new-format
!
route-map TAG_MY_ROUTES permit 10
set community 10:1
!
R2#
[Diagram: R2 in AS 20; R1 and R3 in AS 10]
Receiving Communities
• Applying Policy towards communities does impact routing
• Use route-maps and community-list to
• Match against a certain community
• Modify a BGP attribute as a result
• LOCALPREF, ASPATH prepending, etc
Communities
R1#
router bgp 10
neighbor 20.1.1.1 description R2_PEER
neighbor 20.1.1.1 route-map R2_OR_R3 in
neighbor 30.1.1.1 description R3_PEER
neighbor 30.1.1.1 route-map R2_OR_R3 in
ip community-list standard VIA_R2 permit 100:1
ip community-list standard VIA_R3 permit 100:2
[Diagram: R2 in AS 20; AS 30]
Well Known Communities
• “A community by itself does nothing”
• There are exceptions to every rule
• Well Known Communities do have an automatic impact
Community   Impact
local-AS    Do not send to EBGP peers
Controlling Outbound Traffic
Applying BGP Policy
• Policy based on various attributes:
• ASPATH
• Community
• Destination prefix
• Many, many others…
• Reject/accept selected routes
• Set attributes to influence path selection
• Tools (IOS):
• Distribute-list or prefix-list
• Filter-list (as-path access-list)
• Community-list
• Route-maps (the Swiss army knife)
Policy Control - Prefix List
• Per-peer prefix filter, inbound or outbound
• Allows coverage for ranges of prefix lengths (ge, le)
• Based upon network numbers in NLRI (using familiar IPv4 address/mask format)
router bgp 200
neighbor 220.200.1.1 remote-as 210
neighbor 220.200.1.1 prefix-list PEER-IN in
neighbor 220.200.1.1 prefix-list PEER-OUT out
!
ip prefix-list PEER-IN deny 218.10.0.0/16
ip prefix-list PEER-IN permit 0.0.0.0/0 le 32
ip prefix-list PEER-OUT permit 215.7.0.0/16
ip prefix-list PEER-OUT deny 0.0.0.0/0 le 32
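How the PEER-IN and PEER-OUT lists treat a given route can be sketched as first-match evaluation with an implicit deny at the end. This is a simplified model of prefix-list semantics as generally documented for IOS, not the router's implementation; the tuple layout and function names are hypothetical.

```python
from ipaddress import ip_network


def entry_matches(entry_prefix, route, ge=None, le=None):
    """Without ge/le the entry matches only the exact prefix and length.
    With ge/le the route must fall inside the entry's network and its
    prefix length must lie in the allowed range."""
    entry = ip_network(entry_prefix)
    candidate = ip_network(route)
    if ge is None and le is None:
        return candidate == entry
    low = ge if ge is not None else entry.prefixlen
    high = le if le is not None else candidate.max_prefixlen
    return candidate.subnet_of(entry) and low <= candidate.prefixlen <= high


def evaluate(prefix_list, route):
    """First matching entry decides; no match means implicit deny."""
    for action, prefix, ge, le in prefix_list:
        if entry_matches(prefix, route, ge, le):
            return action
    return "deny"


# (action, prefix, ge, le) tuples mirroring the configuration on this slide.
PEER_IN = [("deny", "218.10.0.0/16", None, None),
           ("permit", "0.0.0.0/0", None, 32)]
PEER_OUT = [("permit", "215.7.0.0/16", None, None),
            ("deny", "0.0.0.0/0", None, 32)]

print(evaluate(PEER_IN, "218.10.0.0/16"))   # exact match on the deny entry
print(evaluate(PEER_IN, "218.10.1.0/24"))   # falls through to "permit any"
print(evaluate(PEER_OUT, "215.7.1.0/24"))   # not the exact /16
```

Note that the PEER-IN deny entry blocks only the exact /16; more-specific routes inside 218.10.0.0/16 are still permitted by the second entry.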
Policy Control - Prefix List
a.b.c.d/x [ge | eq | le] y
Policy Control - Filter List
• Filter routes based on AS path
• Inbound or Outbound
• Example Configuration:
!
router bgp 100
neighbor 220.200.1.1 filter-list 5 out
neighbor 220.200.1.1 filter-list 6 in
!
ip as-path access-list 5 permit ^200$
ip as-path access-list 6 permit ^150$
Policy Control - Regular Expressions
• Simple Examples
• .* Match anything
• ^$ Match routes local to this AS (as-path is empty)
• _1800$ Originated by 1800 (as-path ends with 1800)
• ^1800_ Received from 1800 (as-path starts with 1800)
• _1800_ AS 1800 is somewhere in the as-path
• _790_1800_ Passing through 790 then 1800
• 1800 Literal “1800” is somewhere, also matches 21800, 18001, etc.
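These patterns can be tried offline by translating the Cisco "_" metacharacter into a Python regular expression. The sketch below assumes "_" matches a space, comma, brace, parenthesis, or the start or end of the string, and that the AS-PATH is written as space-separated numbers. The helper names are hypothetical.

```python
import re

# Assumed meaning of "_" in a Cisco AS-PATH regular expression.
_DELIMITER = r"(?:^|$|[ ,{}()])"


def cisco_to_python(pattern: str) -> str:
    """Translate the Cisco-specific "_" into a Python regex fragment."""
    return pattern.replace("_", _DELIMITER)


def as_path_matches(pattern: str, as_path: str) -> bool:
    """True if the pattern matches anywhere in the space-separated AS-PATH."""
    return re.search(cisco_to_python(pattern), as_path) is not None


for pattern, path in [
    ("^$", ""),                  # locally originated, empty AS-PATH
    ("_1800$", "790 1800"),      # originated by 1800
    ("^1800_", "1800 790"),      # received from 1800
    ("_1800_", "21800"),         # delimiters prevent a partial match
    ("1800", "21800"),           # bare literal matches inside 21800
]:
    print(repr(pattern), repr(path), as_path_matches(pattern, path))
```

The last two cases show why the delimiters matter: the bare literal matches inside 21800, while the delimited form does not.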
Local Preference
• An attribute used to influence outbound traffic
• Higher LOCAL_PREF is preferred
• Is compared very early in the Best Path Algorithm
• Is local to an AS
• Local preference is never transmitted to an eBGP peer
• A default LP of 100 is applied to routes from eBGP peers
Local Preference
• Default behavior…LOCALPREF 100
• R2 and R3 prefer eBGP path
• R1 prefers path from R2 over R3 (lower neighbor IP)
[Diagram: AS 10 (R1, R2, R3), AS 20 (R4), AS 30 (R5), AS 40 (R6)]
Local Preference
• R2 advertises LOCALPREF of 200
• R1, R2, and R3 all prefer the R2 exit
[Diagram: AS 10 (R1, R2, R3), AS 20 (R4), AS 30 (R5), AS 40 (R6)]
Local Preference
R2#
!
router bgp 10
neighbor 10.1.1.1 remote-as 10
neighbor 10.1.1.1 route-map SET_LOCAL_PREF out
neighbor 10.1.1.3 remote-as 10
neighbor 10.1.1.3 route-map SET_LOCAL_PREF out
!
route-map SET_LOCAL_PREF permit 10
set local-preference 200
!
[Diagram: R1 and R2 in AS 10]
Alternatives to Local Preference
• Local preference is a very “heavy” attribute to influence routing, as it…
• Especially with Internet routing, AS path length is very important (how “far” is the destination)
• Hence, evaluate attributes for best path manipulation for your design
• No one size fits all, there are lots of ways to implement BGP routing policies…
BGP Attribute
Local Preference
Locally Originated
AS-PATH
ORIGIN
MED
NEXTHOP IGP Cost
BGP Multipath
BGP Multipath
• R1 receives two paths from AS20 (via R2 and R3)
• Best-path algorithm selects one and installs it in routing table
• Assuming all attributes are equal, uses the one from the lower neighbour IP address
• By default, all of the traffic goes via one link only
[Diagram: R1, R4, AS 20]
eBGP Multipath
• Enable eBGP multipath on R1 to install both paths
router bgp 10
maximum-paths 2
• Multipath selection is part of the Best Path algorithm
• Evaluated before the more arbitrary tie breakers like IP address/etc.
• Only paths with identical ASPATH will be considered
• Hidden knob “bgp bestpath as-path multipath-relax” changes this, but be aware of what you’re doing
[Diagram: R1 in AS 10; R4 and R5; AS 20]
iBGP Multipath
• In this topology, eBGP Multipath will not help
• R1 will choose one of the internal paths, and will select either R2 or R3
[Diagram: AS 10 and AS 20, with R2 and R4]
Controlling Inbound Traffic
Controlling Inbound Traffic
• The first rule of controlling inbound traffic…
• You do not have ultimate control of how traffic enters your AS
• Your peers may have outbound policies that will override all of your attempts to influence inbound traffic
• That said, what are your options?
• Leaking more-specific routes
• MED
• AS-PATH Prepending
• Community/Local Pref agreement
Leaking Specific Routes
• A RIB lookup always looks for the most specific match
• A route for 10.1.1.1/32 will be used over 10.1.1.0/24
• You can leak more specific routes to one ISP but not the other
• If the routes are not filtered this will draw the traffic in through the preferred ISP
• Some argue: Advertising more specifics to the global Internet is not “nice” as it causes the Internet BGP table to bloat, and everyone has to bear the costs.
• Many ISPs filter routes that are too specific
• You can’t advertise /32s for your entire address space
• These will obviously be filtered
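Why a leaked more-specific route pulls traffic can be seen with a small longest-prefix-match sketch. The RIB contents and the ISP-A/ISP-B next-hop labels are hypothetical, and the function only models the lookup rule stated above.

```python
from ipaddress import ip_address, ip_network


def longest_match(rib, destination):
    """Return (prefix, next_hop) for the most specific RIB entry covering
    the destination, or None when nothing matches."""
    dest = ip_address(destination)
    best = None
    for prefix, next_hop in rib.items():
        net = ip_network(prefix)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return None if best is None else (str(best[0]), best[1])


# Hypothetical view from a remote router: the /24 is learned via one ISP,
# the leaked /32 via the other.
rib = {"10.1.1.0/24": "ISP-A", "10.1.1.1/32": "ISP-B"}

print(longest_match(rib, "10.1.1.1"))   # the /32 wins for this one host
print(longest_match(rib, "10.1.1.2"))   # everything else follows the /24
```

If an upstream filters the more-specific prefix, only the /24 remains in the remote RIB and the traffic shift disappears.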
Leaking Specific Routes
• You are AS 10
• AS 10 owns 10.1.1.0/24
• AS 20 only uses one link to send traffic to AS 10
• You want to utilize both links
[Diagram: AS 10 (R1, R2, R3) advertises 10.1.1.0/24 to AS 20 (R4); traffic arrives over one link]
Leaking Specific Routes
• Split your /24 in two /25s
• R2
• advertise 10.1.1.0/25
• suppress 10.1.1.128/25
• R3
• suppress 10.1.1.0/25
• advertise 10.1.1.128/25
• AS 20 will now send traffic on both links
• Maybe…maybe not
• In this case the R2 link will receive much more traffic than the R3 link
[Diagram: AS 10 (R1, R2, R3) and AS 20 (R4); www.espn.com at 10.1.1.10, www.watching-paint-dry.com at 10.1.1.140]
MED
• Officially “Multi Exit Discriminator”
• An attribute used to influence inbound traffic
• Lower MED is better
• MED is designed to be a reflection of IGP metrics
• A lower IGP metric is always preferred
[Diagram: AS 10 (R1, R2) and AS 20 (R5); 10.1.2.0/24 advertised with MED 1, 10.1.3.0/24 with MED 2]
MED
MEDs can be set manually
“set metric-type internal” sets MED dynamically
Uses IGP cost to prefix as the MED value
R2 has an IGP cost of 1 to 10.1.2.0
R2 has an IGP cost of 2 to 10.1.3.0
R2#
router bgp 10
neighbor 10.1.1.5 remote-as 20
neighbor 10.1.1.5 route-map SET_MED out
!
route-map SET_MED permit 10
set metric-type internal
!
[Diagram: AS 10 (R1, R2, R3) and AS 20 (R4, R5); 10.1.2.0/24 advertised with MED 1, 10.1.3.0/24 with MED 2]
MED
• Traffic for 10.1.2.0/24 uses the R2 link
• Traffic for 10.1.3.0/24 uses the R3 link
[Diagram: AS 10 (R1, R2, R3) and AS 20 (R4, R5); 10.1.2.0/24 advertised with MED 1, 10.1.3.0/24 with MED 2; traffic toward 10.1.2.1]
MED – bgp always-compare-med
• MEDs are only compared if received from the same AS
• Makes sense as you can’t necessarily compare routing policies across different AS
• R6 does not compare MEDs for the paths received from AS20 and AS30 unless “bgp always-compare-med” is configured
[Diagram: AS 10 (R1, R2, R3), AS 20 (R4), AS 30 (R5), AS 40 (R6)]
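The same-AS condition on the MED step can be sketched as follows. The dictionary layout, the MED values, and the function name are hypothetical; the neighbor AS is taken to be the leftmost AS in the AS-PATH.

```python
def med_decides(path_a, path_b, always_compare_med=False):
    """Return the path preferred by MED, or None when MED is not compared
    (different neighbor AS without always-compare-med) or is tied."""
    same_neighbor_as = path_a["as_path"][:1] == path_b["as_path"][:1]
    if not (same_neighbor_as or always_compare_med):
        return None
    if path_a["med"] == path_b["med"]:
        return None
    return path_a if path_a["med"] < path_b["med"] else path_b


# Hypothetical paths as R6 might see them for a prefix originated by AS 10.
via_as20 = {"as_path": (20, 10), "med": 50}
via_as30 = {"as_path": (30, 10), "med": 10}

print(med_decides(via_as20, via_as30))                           # MED skipped
print(med_decides(via_as20, via_as30, always_compare_med=True))  # lower MED wins
```

When the function returns None, best path selection simply moves on to the next tie-breaker.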
AS-PATH Prepending
• AS 10 can force traffic into R3 by prepending from R2
• A shorter ASPATH is preferred
[Diagram: AS 10 (R1, R2, R3), AS 20 (R4), AS 30 (R5), AS 40 (R6)]
AS-PATH Prepending
R2#
router bgp 10
neighbor 10.1.1.4 remote-as 20
neighbor 10.1.1.4 route-map PREPEND_3X out
!
route-map PREPEND_3X permit 10
set as-path prepend 10 10 10
!
[Diagram: R2 in AS 10, R4 in AS 20]
Community/Local Pref Agreement
• Many providers accept communities from their customers to give customers some control on inbound traffic.
• Example
• Customer sends community 20:80, ISP sets the LOCALPREF to 80
• Customer sends community 20:120, ISP sets the LOCALPREF to 120
[Diagram: R1 and R2 in AS 10; R3 and R4 in AS 20]
Community/LOCALPREF Agreement
R1#
router bgp 10
neighbor 10.1.1.3 remote-as 20
neighbor 10.1.1.3 route-map SET_COMMUNITY out
neighbor 10.1.1.3 send-community
!
route-map SET_COMMUNITY permit 10
set community 20:120
!
[Diagram: R1 and R2 in AS 10; R3 and R4 in AS 20]
Community/LOCALPREF Agreement
R3#
router bgp 20
neighbor 10.1.1.1 remote-as 10
neighbor 10.1.1.1 route-map COMMUNITY_TO_LOCALPREF in
!
ip community-list standard LP_80 permit 20:80
ip community-list standard LP_120 permit 20:120
!
route-map COMMUNITY_TO_LOCALPREF permit 10
match community LP_80
set local-preference 80
!
route-map COMMUNITY_TO_LOCALPREF permit 20
match community LP_120
set local-preference 120
!
route-map COMMUNITY_TO_LOCALPREF permit 30
!
[Diagram: R1 and R2 in AS 10; R3 and R4 in AS 20]
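The effect of the COMMUNITY_TO_LOCALPREF route-map can be sketched as first-match evaluation over its three sequences. This models only the behavior described on these slides; the data structure and function name are hypothetical, and the default LOCALPREF of 100 is the value stated earlier in the deck.

```python
# (sequence, communities that satisfy the match, LOCALPREF to set).
# Sequence 30 has no match and no set clause: it permits everything unchanged.
ROUTE_MAP = [
    (10, {"20:80"}, 80),
    (20, {"20:120"}, 120),
    (30, None, None),
]


def apply_inbound_policy(communities, local_pref=100):
    """Return the LOCALPREF after the route-map, or None if the route is denied."""
    tagged = set(communities)
    for _sequence, wanted, new_local_pref in ROUTE_MAP:
        if wanted is None or wanted & tagged:
            return local_pref if new_local_pref is None else new_local_pref
    return None  # implicit deny when no sequence matches


print(apply_inbound_policy(["20:120"]))  # customer asked for a higher LOCALPREF
print(apply_inbound_policy(["20:80"]))   # customer asked for a lower LOCALPREF
print(apply_inbound_policy([]))          # untagged routes keep the default
```

Without the empty "permit 30" sequence, untagged routes would hit the implicit deny and be dropped, which is why the slide includes it.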
Enterprise Multi-Homed Internet Edge Architectures
Michael Kowal, BRKRST-2044
• Single Router, 1 Link
• Single Router, 2 Links (Equal and Unequal BW)
• Multiple Routers, Multiple Links (Equal and Unequal BW)
• Multiple Routers, Multiple Firewalls, Multi-Site
[Diagram: R1 and R2 connected to the Internet through ISP A and ISP B, with ingress and egress traffic]
Troubleshooting BGP
Vinit Jain
BRKRST-2044
Agenda
• BGP General Operation
• Overview
• eBGP
• iBGP
• Attributes and Best Path Selection Algorithm
• Route Origination
• AS-PATH
• NEXTHOP
• Communities
• Controlling Traffic
• Controlling Outbound Traffic
• BGP Multipath
• Controlling Inbound Traffic
• Route Reflectors
• Convergence
• Initial Convergence
• BGP Routing Convergence
• High Availability
• Show and Tell/Demo Lab
Route Reflectors
• A route received from one iBGP peer will NOT be advertised to another iBGP peer
• Full iBGP mesh is required
[Diagram: R1 and R2 in AS 10]
Route Reflector Basics
• A non-client is any iBGP peer that is not a route reflector client
• Each route reflector is also a non-client of each other route reflector in this network
• Route reflectors must be fully iBGP meshed
[Diagram labels: Route reflectors, Non-client, Cluster A]
Route Reflector – Advertisement Rules
If a Route Reflector Receives a Route from an eBGP Peer what will it do?
• Send the route to ALL BGP peers (iBGP and eBGP)
[Diagram: the RR sends the route to the non-client iBGP peer and to both clients]
Route Reflector – Advertisement Rules
If a Route Reflector Receives a Route from a Client what will it do?
• Reflect the route to all clients
• Reflect the route to all non-clients
• Send the route to all eBGP peers
[Diagram: the RR reflects the route to the client and the non-client iBGP peer, and sends it to the eBGP peer]
Route Reflector – Advertisement Rules
If a Route Reflector Receives a Route from a Non-Client what will it do?
• Reflect the route to all clients
• Send the route to all eBGP peers
[Diagram: the RR reflects the route to both clients and sends it to the eBGP peer]
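The three advertisement rules above fit in a small lookup table. The peer-type labels and the function name below are hypothetical; the table itself only restates the slides.

```python
# Which peer types a route reflector advertises to, keyed by where the
# route was learned.
ADVERTISEMENT_RULES = {
    "ebgp": {"client", "non-client", "ebgp"},
    "client": {"client", "non-client", "ebgp"},
    "non-client": {"client", "ebgp"},
}


def reflector_targets(learned_from: str) -> set:
    """Return the set of peer types that receive the route."""
    return set(ADVERTISEMENT_RULES[learned_from])


for source in ("ebgp", "client", "non-client"):
    print(source, "->", sorted(reflector_targets(source)))
```

The only asymmetry is the non-client case: a route learned from one non-client is not passed to other non-clients, which is why route reflectors themselves must be fully meshed.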
Route Reflector Design and Redundancy
A client may peer with more than one reflector
• A client that peers to only one reflector has a single point of failure
Questions:
• How many reflectors should a single client be peered to?
• Where should the RRs be placed in the network?
• How many RRs are needed?
Route Reflector Design and Redundancy
• Redundancy is needed but…
• Too much burns memory on RRCs because the client learns the same information from each RR
• Also burns memory on the RRs because they learn multiple paths for each route introduced by a RRC
• Two route reflectors per client should be plenty…
• …but this is not a hard and fast rule
• As with everything else…“it depends”
• PEs, RRs, SLAs, network size, network topology, etc.
A word of reason
• Most routers sold in the last decade can easily run 100 or more sessions (all depends on number of prefixes carried)
• ASR1000-RP2 scales to thousands of sessions (Isocore tested 20 Million routes with 1000 RR clients)
• So RP performance is often not the limiting factor of a full iBGP mesh, it’s rather the manageability of adding/removing nodes from the mesh
• So don’t over-engineer it…
Dynamic Neighbors
• Remote peers are defined by IP address range
• Less configuration for defining neighbors
• Remote peers initiate the BGP session
• Enterprise networks (DMVPN, ...)
1
router bgp 1 R1
Control-plane Evolution
• Many services are moving towards BGP to disseminate control-plane information
Service/transport   200x and before   2013 and future
IDR (Peering)       BGP               BGP (IPv6)
SP L3VPN            BGP               BGP + FRR + Scalability
SP Multicast VPN    PIM               BGP Multicast VPN
BGP Address and Sub-Address Families
• BGP can advertise multiple Network Protocols’ reachability information (Multi-Protocol BGP)
• Address Family (AF): Network Protocol Type (ex: IPv4, IPv6, CLNS, etc.)
• Subsequent Address Family (SAF): Additional semantic to the above, for example unicast or multicast, MPLS VPN addresses, etc.
• Some examples (far from exhaustive):
AFI   SAFI   Description
1     1      IPv4 Unicast
1     128    L3VPN IPv4 unicast
MPLS-VPN – L3VPN
• Layer 3 VPN carries customer routing information across an MPLS core
• Customer addresses can overlap (think: multiple enterprise customers all using 10.0.0.0/8)
• Problem: How do we differentiate them on the control plane (BGP) and on the forwarding plane (within the backbone)
[Diagram: CE routers attached to PE 1 and PE 2 across the MPLS Backbone; 10.1.1.0/24 appears at two different CEs]
MPLS-VPN– L3VPN
• Solution:
1. Control Plane: Make addresses distinguishable by adding an additional identifier:
Route Distinguisher (RD): 555:9876:10.1.1.0/24
2. Forwarding Plane: Add routing contexts on PEs, and carry packets as MPLS labeled packets across the backbone.
• BGP advertises VPNv4 addresses (8-byte RD + 4-byte IPv4 address) and a label as NLRI
• Other BGP attributes (AS-Path, Next-Hop, etc.) are included as seen before
NLRI: 555:9876:10.1.1.0/24
Label: 345
Next-Hop: PE1
...
[Figure: PE 1 advertises the VPNv4 route to PE 2 over an MP-iBGP session across the MPLS backbone; both customers' CEs use 10.1.1.0/24]
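How the RD makes overlapping prefixes unique can be sketched in a few lines. This assumes the type 0 RD encoding (2-byte type, 2-byte AS number, 4-byte assigned number); the function names and the second customer's RD 555:9877 are hypothetical.

```python
import ipaddress
import struct


def encode_rd_type0(asn: int, assigned: int) -> bytes:
    """Encode a type 0 RD: 2-byte type (0), 2-byte AS number, 4-byte assigned number."""
    return struct.pack("!HHI", 0, asn, assigned)


def vpnv4_address(asn: int, assigned: int, ipv4: str) -> bytes:
    """Prepend the 8-byte RD to the 4-byte IPv4 address, giving a 12-byte VPNv4 address."""
    return encode_rd_type0(asn, assigned) + ipaddress.IPv4Address(ipv4).packed


# Two customers use the same IPv4 prefix but different RDs.
customer_a = vpnv4_address(555, 9876, "10.1.1.0")
customer_b = vpnv4_address(555, 9877, "10.1.1.0")  # hypothetical second RD

if __name__ == "__main__":
    print(customer_a.hex())
    print(customer_b.hex())
    print("distinct in BGP:", customer_a != customer_b)
```

The IPv4 part is identical in both values; only the RD differs, which is enough for BGP to treat them as two separate routes.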
BGP Routing Convergence:
Initial Convergence
BGP Convergence
• Initial startup
Convergence: Initial Startup
Initial convergence happens when:
• A router boots
• RP failover
• clear ip bgp *
Convergence: Initial Startup
Question: During initial convergence, what work needs to be done?
• Accept routes from all peers
• Not too difficult
• Calculate bestpaths
• This is pretty easy
Convergence: Key Variables
• BGP Variables
• The number of routes
• The number of peers
• The number of update-groups
• The ability to advertise routes to each peer/update-group efficiently
• Router Variables
• CPU horsepower
• Code version
• Interface bandwidth and input & output queues
• Network Variables
• Health of underlying network and transport
• MTU
Convergence: UPDATE Packing
• UPDATE contains a set of Attributes and a list of prefixes (NLRI)
• BGP starts an UPDATE by building an attribute set
• BGP then packs as many destinations (NLRIs) as it can into the UPDATE
• Only NLRI with a matching attribute set can be placed in the UPDATE
• NLRI are added to the UPDATE until it is full (4096 bytes max)
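The packing rule above can be sketched as follows. This is a simplified model, not the actual IOS implementation: it assumes a fixed 23 bytes of message overhead (19-byte header plus the two 2-byte length fields) and takes the size of each attribute set as an input; `pack_updates`, `nlri_size`, and the 50-byte attribute size are hypothetical.

```python
from collections import defaultdict

MAX_UPDATE = 4096            # maximum UPDATE size, as stated above
FIXED_OVERHEAD = 19 + 2 + 2  # BGP header + withdrawn-routes length + attribute length fields


def nlri_size(prefix_len: int) -> int:
    """One length byte plus just enough bytes to hold the prefix bits."""
    return 1 + (prefix_len + 7) // 8


def pack_updates(routes, attr_sizes):
    """Group routes by attribute set, then fill UPDATEs until each is full.

    routes: iterable of (prefix, prefix_len, attr_set_id)
    attr_sizes: dict attr_set_id -> encoded attribute size in bytes (assumed known)
    Returns a list of (attr_set_id, [prefixes]) tuples, one per UPDATE.
    """
    by_attrs = defaultdict(list)
    for prefix, plen, attr_id in routes:
        by_attrs[attr_id].append((prefix, plen))

    updates = []
    for attr_id, nlris in by_attrs.items():
        room = MAX_UPDATE - FIXED_OVERHEAD - attr_sizes[attr_id]
        current, used = [], 0
        for prefix, plen in nlris:
            size = nlri_size(plen)
            if current and used + size > room:
                updates.append((attr_id, current))  # UPDATE is full, start a new one
                current, used = [], 0
            current.append(f"{prefix}/{plen}")
            used += size
        if current:
            updates.append((attr_id, current))
    return updates


if __name__ == "__main__":
    shared = [(f"10.{i // 256}.{i % 256}.0", 24, "A") for i in range(2000)]
    unique = [(f"10.{i // 256}.{i % 256}.0", 24, i) for i in range(2000)]
    print("shared attribute set:", len(pack_updates(shared, {"A": 50})), "UPDATEs")
    print("unique attribute sets:", len(pack_updates(unique, {i: 50 for i in range(2000)})), "UPDATEs")
```

In this model, 2000 /24 routes sharing one attribute set fit into two UPDATEs, while the same routes with 2000 distinct attribute sets need 2000 UPDATEs.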
Convergence: UPDATE Packing
• The fewer attribute sets you have the better
• More NLRI will share an attribute set
• Fewer UPDATEs to converge
• Things you can do to reduce attribute sets
• next-hop-self for all iBGP sessions
• Don’t accept/send communities you don’t need
Convergence
TCP MSS – Max Segment Size
TCP MSS (max segment size) is also a factor in convergence times. The larger the MSS, the fewer TCP packets it takes to transport the BGP updates. Fewer packets mean less overhead and faster convergence.
[Figure: with an increased MSS, one TCP segment (IP header, TCP header) carries the attribute set followed by more NLRIs]
Convergence
TCP MSS – Max Segment Size
• MSS – Max Segment Size
• Limit on the TCP payload carried in one segment for a TCP socket
• 536 bytes by default
• Path MTU Discovery
• Finds smallest MTU between R1 and R2
• Subtract 40 bytes for TCP/IP overhead
• Enabled by default for BGP (at least in recent releases)
• In older releases enable via global cmd “ip tcp path-mtu-discovery”
• To find the MSS
R1#sh ip bgp neighbors
BGP neighbor is 2.2.2.2, remote AS 3, external link
Datagrams (max data segment is 1460 bytes):
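The arithmetic behind these numbers can be sketched as follows. It assumes 40 bytes of overhead (20-byte IPv4 header plus 20-byte TCP header, no options); the function names are hypothetical.

```python
import math

TCP_IP_OVERHEAD = 40  # 20-byte IPv4 header + 20-byte TCP header, no options assumed


def mss_from_path_mtu(path_mtu: int) -> int:
    """MSS is the path MTU minus the TCP/IP header overhead."""
    return path_mtu - TCP_IP_OVERHEAD


def segments_needed(payload_bytes: int, mss: int) -> int:
    """Number of TCP segments needed to carry a given amount of BGP data."""
    return math.ceil(payload_bytes / mss)


if __name__ == "__main__":
    discovered = mss_from_path_mtu(1500)
    print("MSS with a 1500-byte path MTU:", discovered)
    for mss in (536, discovered):
        print(f"one full 4096-byte UPDATE at MSS {mss}:", segments_needed(4096, mss), "segments")
```

With the default MSS of 536, a full 4096-byte UPDATE needs 8 segments; with the 1460-byte MSS shown in the output above, it needs 3.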
Convergence
Update Groups
• BGP must create updates based on the policies towards each peer
• Peers with a common outbound policy are members of the same update-group
• iBGP vs. eBGP
• Outbound route-map, prefix-lists, etc.
• UPDATEs are generated for one member of an update-group and then replicated to the other members
• Back in the old days, these “update-groups” had to be created specifically, using “peer-groups”. They’re still widely deployed…
[Figure: Less Efficient – two peers in different update-groups, each UPDATE (Attribute, NLRI, NLRI) is built separately; More Efficient – two peers in the same update-group, one UPDATE is built and replicated]
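The grouping idea can be sketched like this. It is a simplified model: the outbound policy of each peer is reduced to a (session type, policy name) tuple, whereas a real implementation compares more attributes. The peer names reuse the lab routers; the policy name and `build_update_groups` are hypothetical.

```python
from collections import defaultdict


def build_update_groups(peers):
    """Group peers whose outbound policy key is identical.

    peers: dict peer_name -> (session_type, outbound_policy)
    Returns dict policy_key -> list of peer names in that update-group.
    """
    groups = defaultdict(list)
    for name, policy_key in peers.items():
        groups[policy_key].append(name)
    return dict(groups)


peers = {
    "R2": ("ibgp", "none"),
    "R3": ("ibgp", "none"),
    "R20": ("ebgp", "route-map CUSTOMER-OUT"),  # hypothetical policy name
}
groups = build_update_groups(peers)

if __name__ == "__main__":
    for key, members in groups.items():
        # UPDATEs are formatted once per group and replicated to every member.
        print(key, "->", members)
```

Here three peers need only two sets of UPDATEs; giving every peer its own route-map would produce one update-group per peer.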
Convergence
Dropping TCP Acks
Primarily an issue on RRs (Route Reflectors) with:
• One or two interfaces connecting to the core
• Hundreds of RRCs (Route Reflector Clients)
Convergence
Dropping TCP Acks
• Interface input queue fills up…TCP ACKs are dropped
• Each time a TCP packet is dropped, the session goes into slow start
• It takes a good deal of time for a TCP session to come out of slow start
• Increase the input queue
• hold-queue 1000 in
• If you still see drops increase to 4096
Convergence
Question: How do you know if BGP has converged?
Answer: BGP Table Version
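One common way to apply this check is sketched below. It assumes the values have already been read from the BGP summary output (the overall table version, plus each neighbor's table version and input/output queue depths); the data layout and the `has_converged` name are hypothetical, and the rule itself is a rule of thumb rather than something stated on the slide.

```python
def has_converged(table_version: int, neighbors) -> bool:
    """Treat BGP as converged when every neighbor has caught up.

    neighbors: list of dicts with keys 'tbl_ver', 'in_q', 'out_q'
    (values assumed to be taken from the per-neighbor summary lines).
    """
    return all(
        n["tbl_ver"] == table_version and n["in_q"] == 0 and n["out_q"] == 0
        for n in neighbors
    )


if __name__ == "__main__":
    still_sending = [
        {"tbl_ver": 1200, "in_q": 0, "out_q": 0},
        {"tbl_ver": 950, "in_q": 0, "out_q": 37},  # this peer is still behind
    ]
    caught_up = [
        {"tbl_ver": 1200, "in_q": 0, "out_q": 0},
        {"tbl_ver": 1200, "in_q": 0, "out_q": 0},
    ]
    print(has_converged(1200, still_sending))
    print(has_converged(1200, caught_up))
```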
BGP Table Version
• Understanding the BGP Table Version – Part 1: Introduction to BGP Table Version
http://www.networkingwithfish.com/understanding-the-bgp-table-version-part-1-introduction-to-bgp-table-version/
• Understanding the BGP Table Version – Part 2: BGP Table Version in Action
http://www.networkingwithfish.com/understanding-the-bgp-table-version-part-2-bgp-table-version-in-action/
• Understanding the BGP Table Version – Part 3: BGP Table Version & Troubleshooting
http://www.networkingwithfish.com/understanding-the-bgp-table-version-part-3-bgp-table-version-troubleshooting/
Convergence
Initial Convergence Summary
• Initial convergence time is a function of the amount of work that needs to be done and the router/network’s ability to do this fast and efficiently
• Reduce the number of attribute sets in BGP
• Use next-hop-self, don’t send/accept communities you don’t need, etc.
• Reduce the number of unique outbound policies towards all peers
• Try to find a small set of common policies, rather than individualizing policies per peer
• The fewer update-groups the better
• MSS/PMTU
• Efficient packaging of BGP messages in TCP
• Stop TCP ACK drops
• Increase interface input queues on RRs
BGP Routing Convergence:
Reaction to Failure
IGP vs. BGP Convergence
• IGP (OSPF/ISIS) deals with hundreds of routes
• At most a few thousand, but only a few hundred are really important/relevant
BGP Control-Plane Convergence Components
• Failure Detection
• Reaction to Failure
• Failure Propagation
Convergence =
Failure Detection (Edge)
• Problem: Detect an eBGP neighbour failure
• Available Methods
• Fast External Fallover – monitors line protocol for directly connected neighbours (default behaviour)
router bgp …
 [no] bgp fast-external-fallover
interface …
 ip bgp fast-external-fallover {permit|deny}
Failure Detection – Next-Hop Failure
• Goal: Detect next-hop failures (as carried in IGP)
• Methods:
• Next-hop Tracking, enabled by default
• BGP scanner (legacy, very slow reaction)
• Note: In most cases, we do not want to use iBGP hellos to detect internal/iBGP neighbor failures, and instead rely on next-hop reachability checks
Control vs. Data Plane Convergence – BGP PIC
• Control Plane Convergence
• For the topology after the failure, the optimal path is known and installed in the
dataplane
• May be extremely long (depends on number of prefixes carried)
Deploying BGP Fast Convergence / BGP PIC
Oliver Boehmer, BRKIPM-2265
BGP Prefix Independent
[Figure: BGP prefixes 110.0.0.0/24, 110.1.0.0/24, … point to a shared BGP pathlist (PE1, PE2); each BGP next hop points to an IGP pathlist (PE1 via P1, PE1 via P2), which resolves to adjacencies such as Gig1, dmac=x and Gig2, dmac=y]
• Pointer indirection between BGP and IGP entries allows for immediate update of the multipath BGP pathlist at IGP convergence
• Used in newer IOS and IOS-XR (all platforms), enables Prefix Independent Convergence
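The pointer indirection can be sketched with shared objects. This is only a toy model of the idea, not the actual FIB data structures: the `PathList` class, `resolve`, and the path strings are hypothetical, and 256 prefixes stand in for a full table.

```python
class PathList:
    """A list of paths that many forwarding entries can share by reference."""

    def __init__(self, paths):
        self.paths = list(paths)


# One IGP pathlist describes how to reach PE1; it is shared, not copied.
igp_to_pe1 = PathList(["via P1, Gig1", "via P2, Gig2"])

# One BGP pathlist points at the IGP pathlist for its next hop.
bgp_pathlist = PathList([("PE1", igp_to_pe1)])

# Every BGP prefix points at the same BGP pathlist object.
fib = {f"110.{i}.0.0/24": bgp_pathlist for i in range(256)}


def resolve(prefix: str):
    """Follow the pointers: prefix -> BGP pathlist -> IGP pathlist -> first usable path."""
    next_hop, igp_pathlist = fib[prefix].paths[0]
    return next_hop, igp_pathlist.paths[0]


if __name__ == "__main__":
    print("before failure:", resolve("110.0.0.0/24"))
    # P1 fails: a single in-place change to the shared IGP pathlist...
    igp_to_pe1.paths.remove("via P1, Gig1")
    # ...is immediately visible to every prefix, without touching the prefixes.
    print("after failure: ", resolve("110.0.0.0/24"), resolve("110.255.0.0/24"))
```

The repair cost is one update regardless of how many prefixes exist, which is what makes the convergence prefix independent.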
Wrapping up
“Show and Tell”
[Figure: lab topology]
• AS 10: R1, R2 (IOSv), R3 (IOSv); AS 20: R20 (IOS XE); AS 30: R30 (IOS XR); AS 40: Internet
• Loop0 addresses as labelled on the slide:
R1: 10.100.100.1, 2001:db8:3:3::3/128
R2: 10.100.100.2, 2001:db8:2:2::2/128
R3: 10.100.100.3, 2001:db8:3:3::3/128
R20: 20.100.100.20, 2001:db8:20:20::20/128
R30: 30.100.100.30, 2001:db8:30:30::30/128
AS 40 router: 40.100.100.40, 2001:db8:40:40::40/128
• Links:
R1-R2: 10.1.2.0, 2001:db8:1:2::
R1-R3: 10.1.3.0, 2001:db8:1:3::
R2-R3: 10.2.3.0, 2001:db8:2:3::
R2-R20: 20.2.20.0, 2001:db8:2:20::
R3-R30: 30.3.30.0, 2001:db8:3:30::
BGP Show and Tell: Beginners
https://www.youtube.com/playlist?list=PLVuziKl5zsd6VW41lIl3SWC3nT1oISZBj
Thank You