Deployment and Operation of BGP

Denise Fishburne, Oliver Böhmer


TECRST-1310

rev0207
Agenda
• BGP General Operation
  • Overview
  • eBGP
  • iBGP
• Attributes and Best Path Selection Algorithm
  • Route Origination
  • AS-PATH
  • NEXTHOP
  • Communities
• Controlling Traffic
  • Controlling Outbound Traffic
  • BGP Multipath
  • Controlling Inbound Traffic
• Route Reflectors
• Multiprotocol BGP
• Convergence
  • Initial Convergence
  • BGP Routing Convergence
• Show and Tell/Demo Lab
BGP General Operation
Overview
[Diagram: R2 and R3 each run BGP and maintain a Routing Information Base (RIB). Router_20 and Router_30 connect them to the Internet router, which advertises 40.40.40.0/24. Platform labels: ASR1K, ASR9K.]
TECRST-1310 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 5
Overview
[Diagram: same topology (platform labels: ASR1K, ASR9K, N7K). Both paths to 40.40.40.0/24 are held in the BGP table; the BGP Best Path Algorithm selects one for the RIB.]
BGP Table

40.40.40.0/24
Path #1: via Router_20
Path #2: via Router_3

Overview
[Diagram: the same topology and best-path flow for the IPv6 prefix 2001:db8:100:100::/64.]
BGP Table
2001:db8:100:100::/64
Path #1: via Router_20
Path #2: via Router_3

BGP General Operation
Peering
• BGP peers with other BGP speakers
• Peer is also called “neighbor”
• Uses TCP port 179
• BGP peers exchange routes
• Picks the best path
• Installs in the routing/forwarding table
• Advertises to BGP peers via UPDATEs
• UPDATEs have Attributes
• Routing policies tweak attributes to influence best path selection
[Diagram: peering between R2, R3 and R_20]
Incremental Updates
• Once BGP sends a route to a peer, it assumes the peer will keep it
• There is no periodic refresh
• New UPDATEs are sent when
• The best path changes
• A peer bounces
• A Route-Refresh is requested
Autonomous System
[Diagram: R1, R2 and R3 in AS 10; R_20 in AS 20; R_30 in AS 30; the Internet router in AS 40]
• A network sharing the same routing policy
• Possibly multiple IGPs
• Usually under single administrative control
• An AS originates its routes into BGP
• AS Numbers
• Historically 2 bytes
• 1 to 65535
• 64512 to 65534 are private (65535 is reserved)
• Running out of AS numbers…
• RFC 4893
• 4-byte AS number
• As many AS numbers as there are IPv4 addresses (2^32)
IGP vs. EGP
• IGP – Interior Gateway Protocol
• Exchange routes within an Autonomous System
• Limited Scalability
• Sub-second convergence
• EIGRP, ISIS, OSPF etc.
• EGP – Exterior Gateway Protocol
• Exchange routes between Autonomous Systems
• There once was an EGP called “EGP”
• BGP is the standard EGP today
• Slower convergence in exchange for scalability
eBGP Peering
eBGP - External BGP
• Neighbor in different AS
• Usually directly connected
• Next Hop set to self
[Diagram: R_20 (AS 20) and R_30 (AS 30) each have an eBGP session to the Internet router (AS 40)]
External (eBGP)
• AS #s (Autonomous System Numbers): neighbor is in a different AS
• TTL (Time to Live): 1 (default)
• Next Hop: Change
• Directly Connected Check: Enabled (default)
eBGP - External BGP
Configuration
[Diagram: R_20 (AS 20) and R_30 (AS 30) peer with the Internet router (AS 40)]
R_20
router bgp 20
 bgp router-id 20.100.100.20
 neighbor 5.20.40.40 remote-as 40
 neighbor 5.20.40.40 send-community
Internet
router bgp 40
 router-id 40.100.100.40
 neighbor 5.20.40.20 remote-as 20
  address-family ipv4 unicast
   send-community
Router_20#sh ip bgp summary
BGP router identifier 20.100.100.20, local AS number 20
<snip>
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
5.20.40.40 4 40 8256 9102 4 0 0 5d17h 3
eBGP Multihop
• Peer between loopbacks
• Often used to load-balance traffic over multiple links
[Diagram: R2 (AS 10) and R20 (AS 20) connected by two links]
R_2
router bgp 10
 neighbor 10.1.20.1 remote-as 20
 neighbor 10.1.20.1 update-source loop0
 neighbor 10.1.20.1 ebgp-multihop 2
ip route 10.1.20.1 255.255.255.255 s0/0
ip route 10.1.20.1 255.255.255.255 s1/0
eBGP Multihop
• Peer between loopbacks
• Often used to load-balance traffic over multiple links
[Diagram: R2 (AS 10) and R20 (AS 20) connected by two links]
R_2
router bgp 10
 neighbor 10.1.20.1 remote-as 20
 neighbor 10.1.20.1 update-source loop0
 neighbor 10.1.20.1 disable-connected-check
ip route 10.1.20.1 255.255.255.255 s0/0
ip route 10.1.20.1 255.255.255.255 s1/0
Clearing Up Some Misinformation RE: eBGP Multihop and TTL
http://www.networkingwithfish.com/clearing-up-some-misinformation-re-ebgp-multihop-and-ttl/
iBGP Peering
iBGP - Internal BGP
• Neighbor in same AS
• NEXTHOP is unchanged
• Peer to loopbacks
[Diagram: R2 and R3 peering inside AS 10]
Internal (iBGP)
• AS #s (Autonomous System Numbers): = (neighbor is in the same AS)
• TTL (Time to Live): 255
• Next Hop: unchanged
• Directly Connected Check: disabled
iBGP - Internal BGP
• Cannot advertise a route received from one iBGP peer to another iBGP peer
• Full iBGP mesh is required
• n*(n-1)/2 peering mesh – scaling problem!
• Route-Reflectors relax this constraint
[Diagram: R1, R2 and R3 fully meshed in AS 10]
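To make the n*(n-1)/2 scaling problem concrete, here is a minimal Python sketch that counts iBGP sessions for a full mesh. The function names are hypothetical, and the route-reflector count assumes one particular design (reflectors meshed with each other, every other router a client of every reflector); it is an illustration, not a sizing tool.

```python
def full_mesh_sessions(routers: int) -> int:
    """iBGP sessions needed to fully mesh `routers` speakers: n*(n-1)/2."""
    if routers < 0:
        raise ValueError("router count must be non-negative")
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers: int, reflectors: int) -> int:
    """Sessions in one assumed design: reflectors are fully meshed with each
    other and every remaining router is a client of every reflector."""
    if not 0 < reflectors <= routers:
        raise ValueError("need between 1 and `routers` reflectors")
    clients = routers - reflectors
    return full_mesh_sessions(reflectors) + clients * reflectors


if __name__ == "__main__":
    # Compare full mesh against two route reflectors for a few AS sizes.
    for n in (3, 10, 100):
        print(n, full_mesh_sessions(n), route_reflector_sessions(n, 2))
```

With 100 routers the full mesh needs 4950 sessions, while the assumed two-reflector design needs 197.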

iBGP – Loopback Peering
• Best Practice
• Loopbacks should be /32s
• Have an IGP route to loopbacks
• Configuration
[Diagram: R2 and R3 in AS 10]
R_2
router bgp 10
 bgp router-id 10.100.100.2
 neighbor 10.100.100.3 remote-as 10
 neighbor 10.100.100.3 update-source Loopback0
 neighbor 10.100.100.3 next-hop-self
R3
router bgp 10
 bgp router-id 10.100.100.3
 neighbor 10.100.100.2 remote-as 10
 neighbor 10.100.100.2 update-source Loopback0
 neighbor 10.100.100.2 next-hop-self
iBGP – Loopback Peering
• Loopback peering promotes stability
• There are two paths between R1 and R2
• If the link between them fails
• Peering with interface IP would bring down the BGP session
• Peering to a loopback allows the session to stay up
[Diagram: R1, R2 and R3 in AS 10]
Attributes and Best Path Selection Algorithm
Attributes
• IGP
• Primary attribute is a cost/metric
• The path with the lowest metric is the best…nice and easy

• BGP
• Routing Policy between AS is usually more complex
• Shortest path is not necessarily the best one
• Has many attributes to describe reachability to a destination
• The “Best Path Algorithm” compares attributes between different paths to select the best
• Route-policies are used to tweak attributes to influence the outcome of Best Path selection, and therefore routing
BGP Route vs. BGP Path
• BGP can have multiple paths per route
• Here we have 2 paths to the 40.40.40.0/24 prefix
R2#show ip bgp 40.40.40.0
BGP routing table entry for 40.40.40.0/24
Paths: (2 avail, best #2, table default)
30 40
10.100.100.3 (metric 2) from 10.100.100.3
Origin IGP, metric 0, localpref 100, valid, internal
20 40
20.2.20.20 from 20.2.20.20 (20.100.100.20)
Origin IGP, localpref 100, valid, external, best
Community: 40:1
R2#
BGP Route vs. BGP Path
• “show ip bgp summary” provides the total number of routes and paths
• Paths and routes both consume memory
• The more paths you have per route, the more memory consumed
R3#show ip bgp summary
BGP router identifier 10.100.100.3, local AS number 10
BGP table version is 4, main routing table version 4
3 network entries using 432 bytes of memory
6 path entries using 480 bytes of memory
5/3 BGP path/bestpath attribute entries using 800 bytes of memory
2 BGP AS-PATH entries using 48 bytes of memory
1 BGP community entries using 24 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 1784 total bytes of memory
BGP activity 9/6 prefixes, 28/22 paths, scan interval 60 secs
BGP Path Selection Algorithm
#   Attribute                    Logic
1   Weight                       Higher is better. Local to the router…not really an attribute.
2   Local Preference             Local to an AS…higher is better
3   Locally Originated           Corner case…“network 10.0.0.0” vs. “aggregate 10.0.0.0” vs. “redistribute” on the same router
4   AS-PATH                      Shorter AS-PATH is better
5   ORIGIN                       IGP < EGP < Incomplete
6   MED                          Is often a reflection of IGP metrics so lower is better
7   eBGP vs. iBGP                Prefer eBGP path over iBGP path
8   IGP cost to NEXTHOP          Lower is better
9   Lowest Router ID             Lower is better
10  Shortest CLUSTER_LIST        Shorter is better
11  Lowest neighbor IP address   Lower is better
• All details at http://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html
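The ordered comparison in the table can be sketched as a sort key in Python. This is a simplified illustration under stated assumptions, not the router's implementation: the `Path` class and its field names are hypothetical, MED is compared between all paths (as if “bgp always-compare-med” were set), a missing MED is treated as 0, and steps outside the table (next-hop reachability, oldest-path preference, multipath) are ignored.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

# ORIGIN ranking from the table: IGP < EGP < Incomplete (lower wins).
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    """Hypothetical container for the attributes used by the table."""
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = False
    igp_cost: int = 0
    router_id: str = "0.0.0.0"
    cluster_list_len: int = 0
    neighbor_ip: str = "0.0.0.0"


def preference_key(p: Path) -> tuple:
    """Tuple that sorts the most preferred path first, one entry per step."""
    return (
        -p.weight,                          # 1: higher weight wins
        -p.local_pref,                      # 2: higher local preference wins
        0 if p.locally_originated else 1,   # 3: locally originated wins
        len(p.as_path),                     # 4: shorter AS-PATH wins
        ORIGIN_RANK[p.origin],              # 5: IGP < EGP < Incomplete
        p.med,                              # 6: lower MED wins (simplified)
        0 if p.is_ebgp else 1,              # 7: eBGP over iBGP
        p.igp_cost,                         # 8: lower IGP cost to NEXTHOP
        int(IPv4Address(p.router_id)),      # 9: lower router ID
        p.cluster_list_len,                 # 10: shorter CLUSTER_LIST
        int(IPv4Address(p.neighbor_ip)),    # 11: lower neighbor IP address
    )


def best_path(paths: list) -> Path:
    """Return the path that wins the simplified comparison."""
    return min(paths, key=preference_key)


# The two paths to 40.40.40.0/24 seen on R2 in this deck.
via_r3 = Path(as_path=(30, 40), igp_cost=2, router_id="10.100.100.3",
              neighbor_ip="10.100.100.3")
via_r20 = Path(as_path=(20, 40), is_ebgp=True, router_id="20.100.100.20",
               neighbor_ip="20.2.20.20")
print(best_path([via_r3, via_r20]) is via_r20)
```

Both paths tie through step 6, so the eBGP path via Router_20 wins at step 7, which agrees with the “best #2” shown in the R2 output.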

BGP Path Selection Algorithm
• Hard to remember all of that?
BGP Attribute        “Denise”ism
Weight               Wise
Local Preference     Lip
Locally Originated   Lovers
AS-PATH              Apply
ORIGIN               Oral
MED                  Medication
eBGP vs. iBGP        Every
NEXTHOP IGP Cost     Night
Overview
[Diagram: R2 and R3 in AS 10 with iBGP between them; R2 has an eBGP session to Router_20 (AS 20) and R3 has an eBGP session to Router_30 (AS 30); AS 40 originates 40.40.40.0/24]
R2#sh ip bgp 40.40.40.0
BGP routing table entry for 40.40.40.0/24
Paths: (2 avail, best #2, table default)
  30 40    <-- as-path
    10.100.100.3 (metric 2) from 10.100.100.3
      Origin IGP, metric 0, localpref 100, valid, internal    <-- Internally learned (iBGP)
  20 40    <-- as-path
    20.2.20.20 from 20.2.20.20 (20.100.100.20)
      Origin IGP, localpref 100, valid, external, best    <-- Externally learned (eBGP)
      Community: 40:1    <-- community
R2#
Route Origination
Route Origination

[Diagram: the Internet router in AS 40: “I am AS 40 and I own 40.40.40.0/24”]
router bgp 40
 router-id 40.100.100.40
 address-family ipv4 unicast
  network 40.40.40.0/24
• An AS must “originate” routes for its address space
• Three ways to originate a route…
Route Origination – Network Statements
• Easiest/Cleanest method
• Network 40.40.40.0 mask 255.255.255.0
• Requires 40.40.40.0/24 to be in the RIB
• Floating static route to Null0 is common
• Originates 40.40.40.0/24
• Easy to determine/control what you are originating

router bgp 40
router-id 40.100.100.40
address-family ipv4 unicast
network 40.40.40.0/24
!
ip route 40.40.40.0 255.255.255.0 Null0 250
Route Origination – Network Statements
London-Internet# show ip bgp 40.40.40.0
BGP routing table information for VRF default, address family IPv4 Unicast
BGP routing table entry for 40.40.40.0/24, version 27
Paths: (1 available, best #1)
Flags: (0x080002) on xmit-list, is not in urib
  Advertised path-id 1
  Path type: local, path is valid, is best path
  AS-Path: NONE, path locally originated
    0.0.0.0 (metric 0) from 0.0.0.0 (40.100.100.40)
      Origin IGP, MED not set, localpref 100, weight 32768
  Path-id 1 advertised to peers:
    5.20.40.20 5.30.40.30
London-Internet#
• The Origin is IGP
• Weight is 32768
• “0.0.0.0 from 0.0.0.0”
Route Origination – Redistribution
• Routes can be redistributed into BGP
• Pros
• Easy to configure and setup
• Cons
• IGP instability is passed along to BGP
• Isn’t always obvious what routes you are originating
• “Redistribute static” is especially dangerous
• What if someone configures a static route for Google’s address space?
• You could blackhole Google’s traffic

• → Use route-maps to control what you’re redistributing
Route Origination – Redistribution
• Things to note
• The nexthop for the OSPF route is 10.1.1.14
• The OSPF metric is 11

R10#show ip route 10.1.1.3
Routing entry for 10.1.1.3/32
  Known via "ospf 10", distance 110, metric 11, type intra area
  Redistributing via bgp 10
  Advertised by bgp 10
  Last update from 10.1.1.14 on Ethernet0/2, 02:56:42 ago
  Routing Descriptor Blocks:
  * 10.1.1.14, from 10.1.1.3, 02:56:42 ago, via Ethernet0/2
      Route metric is 11, traffic share count is 1
Route Origination – Redistribution
• NEXTHOP uses the IGP nexthop of 10.1.1.14
• ORIGIN is set to “Incomplete”
• “metric” here means MED
• Uses the IGP metric of 11
• Weight is 32768
R10#show ip bgp 10.1.1.3
BGP routing table entry for 10.1.1.3/32, version 5
Paths: (1 available, best #1, table default)
Advertised to update-groups:
9
Local
10.1.1.14 from 0.0.0.0 (10.1.1.2)
Origin incomplete, metric 11, localpref 100, weight 32768, valid, sourced, best
Route Origination – Aggregation
• Typically used by ISPs to summarize their address space
• Reduces number of routes in global BGP table
• Adds AGGREGATOR attribute
• Contains Router-ID and AS of the router that did the aggregation
• Used for troubleshooting

Route Origination – Aggregation
• Configure an “aggregate-address” statement
• BGP table must have component route(s)
• Components are the longer length prefixes that fall within the aggregate’s range
• Use “show ip bgp x.x.x.x y.y.y.y longer” to check for components
• Component routes are still advertised
[Diagram: R11 (AS 100) advertises to R12 (AS 200)]
Check for component route(s)
R11#show ip bgp 10.1.0.0 255.255.0.0 longer
10.1.1.0/24, 10.1.2.0/24, etc listed here
R11#
router bgp 10
 aggregate-address 10.1.0.0 255.255.0.0
!
Component routes (still advertised):
 NLRI: 10.1.1.0/24, 10.1.2.0/24, etc
 AS-PATH: 10 200 300 400
Aggregate:
 NLRI: 10.1.0.0/16
 AS-PATH: 10
 AGGREGATOR AS: 10
 AGGREGATOR ID: 10.1.1.1
Route Origination – Aggregation
• Adding the keyword summary-only causes BGP to suppress the components of the aggregate
• Suppressed route: use it, but do not advertise it to any peer
[Diagram: R11 (AS 100) advertises to R12 (AS 200)]
Check for component route(s)
R11#show ip bgp 10.1.0.0 255.255.0.0 longer
10.1.1.0/24, 10.1.2.0/24, etc listed here
R11#
router bgp 10
 aggregate-address 10.1.0.0 255.255.0.0 summary-only
!
Component routes (suppressed):
 NLRI: 10.1.1.0/24, 10.1.2.0/24, etc
 AS-PATH: 10 200 300 400
Aggregate (advertised):
 NLRI: 10.1.0.0/16
 AS-PATH: 10
 AGGREGATOR AS: 10
 AGGREGATOR ID: 10.1.1.1
AS-PATH
AS-PATH
• The AS-PATH tells the story of which AS a route has traversed
• A BGP speaker prepends its own AS# to the AS-PATH when
advertising to an eBGP peer
• AS-Path is used for loop detection on the border of the AS
• BGP drops an external update if it sees its own AS in the path
• When viewing the AS-PATH, the most recent AS is on the left, the
originating AS is on the far right
• Shortest AS-PATH is often the tie-breaker for best path selection
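The prepend and loop-detection rules above can be sketched in a few lines of Python. The function names are hypothetical and the AS-PATH is modelled as a plain list of AS numbers (no AS_SET or confederation segments), so treat this as an illustration of the two rules only.

```python
def advertise_to_ebgp(as_path, local_as, extra_prepends=0):
    """AS-PATH as sent to an eBGP peer: the local AS goes on the left.

    `extra_prepends` models additional policy-driven prepending
    (an assumption for illustration; 0 means normal behavior).
    """
    return [local_as] * (1 + extra_prepends) + list(as_path)


def accept_from_ebgp(as_path, local_as):
    """Loop detection: drop an external update that already contains our AS."""
    return local_as not in as_path


# AS 40 originates the route, AS 20 passes it on to AS 10.
path_from_40 = advertise_to_ebgp([], 40)
path_from_20 = advertise_to_ebgp(path_from_40, 20)
print(path_from_20)                        # most recent AS on the left
print(accept_from_ebgp(path_from_20, 10))  # AS 10 is not in the path
print(accept_from_ebgp([20, 10, 40], 10))  # AS 10 sees itself: loop
```

The originating AS ends up on the far right because every AS adds itself on the left as the route leaves it.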

AS-PATH
[Diagram: R2 and R3 in AS 10; R20 in AS 20; R30 in AS 30; the Internet router in AS 40]
AS-PATH
R2#show ip bgp 40.1.1.0
BGP routing table entry for 40.1.1.0/24, version 6
Paths: (2 available, best #2, table default)
Advertised to update-groups:
14
Refresh Epoch 1
30 40
10.100.100.3 (metric 2) from 10.100.100.3 (10.100.100.3)
Origin IGP, metric 0, localpref 100, valid, internal
rx pathid: 0, tx pathid: 0
Refresh Epoch 1
20 40
20.2.20.20 from 20.2.20.20 (20.100.100.20)
Origin IGP, localpref 100, valid, external, best
rx pathid: 0, tx pathid: 0x0
R2#

NEXTHOP
NEXTHOP
• NEXTHOP is the address that we must route towards in order to reach
the BGP prefix
• Paths where the next-hop is unreachable are not considered for best-path
calculation
• eBGP does “next-hop-self” automatically
• Multiple eBGP peers on the same subnet is an exception

• iBGP does not modify the NEXTHOP by default


• NEXTHOP will remain as the IP of the eBGP peer
• Forces BGP speakers in an AS to have routes for the eBGP facing links
• Would need to carry many /30 eBGP facing links in our IGP
• Best practice is to use “next-hop-self” to avoid this

iBGP without next-hop-self
• NEXTHOP does not change
• AS 10’s IGP must have route to 30.1.1.9
• Adds many /30s to IGP
[Diagram: R1, R2 and R3 in AS 10; R3’s E0/0 connects to R5 (30.1.1.9) in AS 30]
iBGP with next-hop-self
• R3 changes NEXTHOP to its “update-source” interface
• iBGP should always use loopback peering
• AS 10’s IGP has a route to R3’s loopback 10.1.1.3
[Diagram: R1, R2 and R3 (Loop0 10.1.1.3) in AS 10; R3’s E0/0 connects to R5 (30.1.1.9) in AS 30]
Communities
Communities
• A COMMUNITY is an attribute that stores a number
• 4-byte number that is usually displayed in X:Y notation
• “ip bgp-community new-format” triggers X:Y notation
• A community by itself does nothing
• Tagging a prefix with 100:1 or 100:2 will not change routing in any way
• Set communities via a route-map
• Communities are not advertised by default
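To illustrate how one 4-byte number maps to the X:Y notation, here is a small Python sketch. It assumes the usual split of the 32-bit value into two 16-bit halves (X in the high half, Y in the low half); the function names are hypothetical.

```python
def encode_community(text: str) -> int:
    """Pack an X:Y community string into a single 32-bit number."""
    high, low = (int(part) for part in text.split(":"))
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError("each half must fit in 16 bits")
    return (high << 16) | low


def decode_community(value: int) -> str:
    """Unpack a 32-bit community into X:Y notation."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("community must fit in 32 bits")
    return f"{value >> 16}:{value & 0xFFFF}"


print(encode_community("100:1"))    # the raw 32-bit value
print(decode_community(6553601))    # back to X:Y notation
```

By convention X is usually the AS number that defines the community and Y is a value whose meaning that AS chooses.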

Sending Communities
[Diagram: R2 (AS 20) and R3 peer with R1 (AS 10)]
R2#
router bgp 20
 neighbor 10.1.1.2 remote-as 10
 neighbor 10.1.1.2 send-community
 neighbor 10.1.1.2 route-map TAG_MY_ROUTES out
!
ip bgp-community new-format
!
route-map TAG_MY_ROUTES permit 10
 set community 10:1
!
R2#
Receiving Communities
• Applying Policy towards communities does impact routing
• Use route-maps and community-list to
• Match against a certain community
• Modify a BGP attribute as a result
• LOCALPREF, ASPATH prepending, etc

• You can impact 1000s of prefixes by applying policy based on a single community
Communities
R1#
router bgp 10
 neighbor 20.1.1.1 description R2_PEER
 neighbor 20.1.1.1 route-map R2_OR_R3 in
 neighbor 30.1.1.1 description R3_PEER
 neighbor 30.1.1.1 route-map R2_OR_R3 in
ip community-list standard VIA_R2 permit 100:1
ip community-list standard VIA_R3 permit 100:2
route-map R2_OR_R3 permit 10
 match community VIA_R2
 set local-preference 120
route-map R2_OR_R3 permit 20
 match community VIA_R3
 set local-preference 130
route-map R2_OR_R3 permit 30
[Diagram: R1 (AS 10) peers with R2 (AS 20) and R3 (AS 30)]
Well Known Communities
• “A community by itself does nothing”
• There are exceptions to every rule :-)
• Well Known Communities do have an automatic impact
Community      Impact
local-AS       Do not send to EBGP peers
no-advertise   Do not advertise to any peer
no-export      Do not export outside AS/confed
Controlling Outbound Traffic
Applying BGP Policy
• Policy based on various attributes:
• ASPATH
• Community
• Destination prefix
• Many, many others…
• Reject/accept selected routes
• Set attributes to influence path selection
• Tools (IOS):
• Distribute-list or prefix-list
• Filter-list (as-path access-list)
• Community-list
• Route-maps (the Swiss army knife)
Policy Control - Prefix List
• Per-peer prefix filter, inbound or outbound
• Allows coverage for ranges of prefix lengths (ge, le)
• Based upon network numbers in NLRI (using familiar IPv4 address/mask format)
router bgp 200
 neighbor 220.200.1.1 remote-as 210
 neighbor 220.200.1.1 prefix-list PEER-IN in
 neighbor 220.200.1.1 prefix-list PEER-OUT out
!
ip prefix-list PEER-IN deny 218.10.0.0/16
ip prefix-list PEER-IN permit 0.0.0.0/0 le 32
ip prefix-list PEER-OUT permit 215.7.0.0/16
ip prefix-list PEER-OUT deny 0.0.0.0/0 le 32
Policy Control - Prefix List
a.b.c.d/x [ge | eq | le] y
• a.b.c.d/x: base prefix; x sets the care vs. don’t care bits
• ge | eq | le: operator
• y: operand (prefix length to match)
• ip prefix-list PEER-IN permit 10.0.0.0/8 le 32 = All 10.x.x.x subnets, regardless of mask length (e.g. 10.1.2.0/24, 10.1.1.1/32, 10.1.0.0/16)
• 0.0.0.0/0 eq 32 = All /32 prefixes (e.g. 1.2.3.4/32)
• 192.168.1.0/24 = 192.168.1.0/24 eq 24 (ONLY 192.168.1.0/24)
• 172.16.0.0/16 ge 28 = all subnets from 172.16.0.0/16 that have a mask
length of /28 or greater (e.g. 172.16.4.0/28)
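The matching rules in these examples can be reproduced with Python's standard `ipaddress` module. This is a sketch under assumptions: the `matches` and `evaluate` helpers are hypothetical, “eq y” is modelled as ge = le = y, entries are evaluated top-down with first match winning, and a route that matches nothing is denied.

```python
from ipaddress import ip_network


def matches(base, route, ge=None, le=None):
    """True if `route` matches the entry `base [ge x] [le y]`."""
    base_net = ip_network(base)
    route_net = ip_network(route)
    if ge is None and le is None:
        # No operator: exact match on prefix and length.
        return route_net == base_net
    low = ge if ge is not None else base_net.prefixlen
    high = le if le is not None else base_net.max_prefixlen
    # The "care" bits must agree and the length must be in range.
    return route_net.subnet_of(base_net) and low <= route_net.prefixlen <= high


def evaluate(prefix_list, route):
    """First matching entry decides; no match means deny (assumed default)."""
    for action, base, ge, le in prefix_list:
        if matches(base, route, ge, le):
            return action == "permit"
    return False


# The PEER-IN list from the configuration slide.
PEER_IN = [
    ("deny", "218.10.0.0/16", None, None),
    ("permit", "0.0.0.0/0", None, 32),
]

print(matches("10.0.0.0/8", "10.1.1.1/32", le=32))
print(matches("172.16.0.0/16", "172.16.4.0/24", ge=28))
print(evaluate(PEER_IN, "218.10.0.0/16"))
```

Note that the first PEER-IN entry has no operator, so in this model it denies only the exact /16 and longer prefixes inside 218.10.0.0/16 fall through to the permit entry.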

Policy Control - Filter List
• Filter routes based on AS path
• Inbound or Outbound
• Example Configuration:
!
router bgp 100
neighbor 220.200.1.1 filter-list 5 out
neighbor 220.200.1.1 filter-list 6 in
!
ip as-path access-list 5 permit ^200$
ip as-path access-list 6 permit ^150$

Policy Control - Regular Expressions
• Simple Examples
• .* Match anything
• ^$ Match routes local to this AS (as-path is empty)
• _1800$ Originated by 1800 (as-path ends with 1800)
• ^1800_ Received from 1800 (as-path starts with 1800)
• _1800_ AS 1800 is somewhere in the as-path
• _790_1800_ Passing through 790 then 1800
• 1800 Literal “1800” is somewhere, also matches 21800, 18001, etc.
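These patterns can be tried out with Python's `re` module. The sketch below is an approximation: it rewrites “_” as “start of string, end of string, or a space”, which covers the plain AS-PATH strings used here but not every delimiter a router accepts. The helper name is hypothetical.

```python
import re

# Simplified stand-in for the "_" delimiter (assumption: space-separated path).
BOUNDARY = r"(?:^|$| )"


def as_path_matches(pattern: str, as_path: str) -> bool:
    """True if the as-path string matches the router-style pattern."""
    translated = pattern.replace("_", BOUNDARY)
    return re.search(translated, as_path) is not None


print(as_path_matches("^$", ""))                  # locally originated
print(as_path_matches("_1800$", "790 1800"))      # originated by 1800
print(as_path_matches("^1800_", "1800 40"))       # received from 1800
print(as_path_matches("_1800_", "21800 40"))      # 21800 is not AS 1800
print(as_path_matches("1800", "21800 40"))        # literal match is looser
```

The last two calls show why the delimiter matters: the bare literal also hits 21800, while the delimited pattern does not.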

Local Preference
• An attribute used to influence outbound traffic
• Higher LOCAL_PREF is preferred
• Is compared very early in the Best Path Algorithm
• Is local to an AS
• Local preference is never transmitted to an eBGP peer
• A default LP of 100 is applied to routes from eBGP peers

Local Preference
• Default behavior…LOCALPREF 100
• R2 and R3 prefer eBGP path
• R1 prefers path from R2 over R3 (lower neighbor IP)
[Diagram: R1, R2 and R3 in AS 10; R2 peers with R4 (AS 20), R3 peers with R5 (AS 30); R4 and R5 both peer with R6 (AS 40)]
Local Preference
• R2 advertises LOCALPREF of 200
• R1, R2, and R3 all prefer the R2 exit
[Diagram: R1, R2 and R3 in AS 10; R2 peers with R4 (AS 20), R3 peers with R5 (AS 30); R4 and R5 both peer with R6 (AS 40)]
Local Preference
[Diagram: R1, R2 and R3 in AS 10]
R2#
!
router bgp 10
 neighbor 10.1.1.1 remote-as 10
 neighbor 10.1.1.1 route-map SET_LOCAL_PREF out
 neighbor 10.1.1.3 remote-as 10
 neighbor 10.1.1.3 route-map SET_LOCAL_PREF out
!
route-map SET_LOCAL_PREF permit 10
 set local-preference 200
!
Or: set localpref inbound on eBGP session. There are always multiple ways to skin a cat :-}
Alternatives to Local Preference
• Local preference is a very “heavy” attribute to influence routing, as it is evaluated very early in best path algorithm
• Especially with Internet routing, AS path length is very important (how “far” is the destination)
• Hence, evaluate attributes for best path manipulation for your design
• No one size fits all, there are lots of ways to implement BGP routing policies…
BGP Attribute (in evaluation order): Weight, Local Preference, Locally Originated, AS-PATH, ORIGIN, MED, eBGP vs. iBGP, NEXTHOP IGP Cost
BGP Multipath
BGP Multipath
• R1 receives two paths from AS20 (via R2 and R3)
• Best-path algorithm selects one and installs it in routing table
• Assuming all attributes are equal, uses the one from the lower neighbour IP address
• → By default, all of the traffic goes via one link only
• We could do some manual load-sharing via localpref/MED, but that’s cumbersome
[Diagram: R1 (AS 10) connected to R4 and R5 (AS 20)]
eBGP Multipath
• Enable eBGP multipath on R1 to install both paths
router bgp 10
 maximum-paths 2
• Multipath selection is part of the Best Path algorithm
• Evaluated before the more arbitrary tie breakers like IP address/etc.
• Only paths with identical ASPATH will be considered
• Hidden knob “bgp bestpath as-path multipath-relax” changes this, but be aware of what you’re doing
[Diagram: R1 (AS 10) connected to R4 and R5 (AS 20)]
iBGP Multipath
• In this topology, eBGP Multipath will not help
• R1 will choose one of the internal paths, and will select either R2 or R3
• When R1’s IGP cost to R2 and R3 is equal, and all other path attributes are the same, iBGP multipath can be used
router bgp 10
 maximum-paths ibgp 2
[Diagram: R1, R2 and R3 in AS 10; R2 peers with R4 and R3 peers with R5 in AS 20]
Controlling Inbound Traffic
Controlling Inbound Traffic
• The first rule of controlling inbound traffic…
• You do not have ultimate control of how traffic enters your AS
• Your peers may have outbound policies that will override all of your attempts to
influence inbound traffic
• That said, what are your options?
• Leaking more-specific routes
• MED
• AS-PATH Prepending
• Community/Local Pref agreement

Leaking Specific Routes
• A RIB lookup always looks for the most specific match
• A route for 10.1.1.1/32 will be used over 10.1.1.0/24
• You can leak more specific routes to one ISP but not the other
• If the routes are not filtered this will draw the traffic in through the preferred ISP

• Some argue: Advertising more specifics to the global Internet is not “nice” as it
causes the Internet BGP table to bloat, and everyone has to bear the costs..
• Many ISPs filter routes that are too specific
• You can’t advertise /32s for your entire address space
• These will obviously be filtered
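The “most specific match wins” rule that makes leaking work can be shown with a short Python sketch. The `longest_match` helper is hypothetical and scans a plain list of prefixes, which is fine for illustration but is not how a router stores its RIB.

```python
from ipaddress import ip_address, ip_network


def longest_match(destination, routes):
    """Return the most specific prefix in `routes` that covers `destination`."""
    addr = ip_address(destination)
    candidates = [ip_network(r) for r in routes if addr in ip_network(r)]
    if not candidates:
        return None
    return str(max(candidates, key=lambda net: net.prefixlen))


# A /32 beats the covering /24.
print(longest_match("10.1.1.1", ["10.1.1.0/24", "10.1.1.1/32"]))

# A /24 announced everywhere plus one /25 leaked to a single ISP.
table = ["10.1.1.0/24", "10.1.1.0/25"]
print(longest_match("10.1.1.10", table))    # follows the leaked /25
print(longest_match("10.1.1.140", table))   # falls back to the /24
```

Traffic for the lower half of the /24 follows the leaked /25 wherever that route is accepted, while the upper half still follows the /24.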

Leaking Specific Routes
• You are AS 10
• AS 10 owns 10.1.1.0/24
• AS 20 only uses one link to send traffic to AS 10 :-(
• You want to utilize both links
[Diagram: R1, R2 and R3 in AS 10, R4 in AS 20; 10.1.1.0/24 is advertised to AS 20; traffic enters over a single link]
Leaking Specific Routes
• Split your /24 in two /25s
• R2
• advertise 10.1.1.0/25
• suppress 10.1.1.128/25
• R3
• suppress 10.1.1.0/25
• advertise 10.1.1.128/25
• AS 20 will now send traffic on both links
• Q: What are the problems with this policy?
[Diagram: R1, R2 and R3 in AS 10, R4 in AS 20; R2 advertises 10.1.1.0/25; traffic now arrives on both links]
Leaking Specific Routes
• Will inbound traffic split 50/50 on your two links?
• Maybe…maybe not
• In this case the R2 link will receive much more traffic than the R3 link
[Diagram: www.espn.com (10.1.1.10) sits in 10.1.1.0/25, reached via R2; www.watching-paint-dry.com (10.1.1.140) sits in the other /25, reached via R3]
MED
• Officially “Multi Exit Discriminator”
• An attribute used to influence inbound traffic
• Lower MED is better
• MED is designed to be a reflection of IGP metrics
• A lower IGP metric is always preferred
• Therefore lower MED is preferred
• Used to bring traffic into the AS on the eBGP speaker closest to the destination
[Diagram: R1, R2, R3 and R4 in AS 10 with 10.1.2.0/24 and 10.1.3.0/24; R2 advertises 10.1.2.0/24 with MED 1 and 10.1.3.0/24 with MED 2 to R5 in AS 20]
MED
• MEDs can be set manually
• “set metric-type internal” sets MED dynamically
• Uses IGP cost to prefix as the MED value
• R2 has an IGP cost of 1 to 10.1.2.0
• R2 has an IGP cost of 2 to 10.1.3.0
R2#
router bgp 10
 neighbor 10.1.1.5 remote-as 20
 neighbor 10.1.1.5 route-map SET_MED out
!
route-map SET_MED permit 10
 set metric-type internal
!
[Diagram: R1, R2, R3 and R4 in AS 10; R2 advertises 10.1.2.0/24 with MED 1 and 10.1.3.0/24 with MED 2 to R5 in AS 20]
MED
• Traffic for 10.1.2.0/24 uses the R2 link
• Traffic for 10.1.3.0/24 uses the R3 link
[Diagram: same topology; traffic from R5 towards 10.1.2.1 enters AS 10 via R2]
MED – bgp always-compare-med
• MEDs are only compared if received from the same AS
• Makes sense as you can’t necessarily compare routing policies across different AS
• R6 does not compare MEDs for the paths received from AS20 and AS30 unless “bgp
always-compare-med” is configured
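The rule above can be sketched as a small Python function. The dictionary keys and the function name are hypothetical, and the sketch covers only the MED step in isolation: it returns the path preferred by MED, or None when MED does not decide and later tie-breakers would apply.

```python
def prefer_by_med(path_a, path_b, always_compare_med=False):
    """Return the path preferred by MED, or None if MED does not decide."""
    same_neighbor_as = path_a["neighbor_as"] == path_b["neighbor_as"]
    if not same_neighbor_as and not always_compare_med:
        return None  # MEDs from different ASes are not compared by default
    if path_a["med"] == path_b["med"]:
        return None  # tie: fall through to the next step
    return path_a if path_a["med"] < path_b["med"] else path_b


# R6 receives one path from AS 20 and one from AS 30 (MED values assumed).
via_as20 = {"neighbor_as": 20, "med": 50}
via_as30 = {"neighbor_as": 30, "med": 10}

print(prefer_by_med(via_as20, via_as30))                           # not compared
print(prefer_by_med(via_as20, via_as30, always_compare_med=True))  # lower MED wins
```

Two paths from the same neighbor AS are always compared, so the flag matters only when the paths arrive from different ASes.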
[Diagram: R1, R2 and R3 in AS 10; R2 peers with R4 (AS 20), R3 peers with R5 (AS 30); R4 and R5 both peer with R6 (AS 40)]
AS-PATH Prepending
• AS 10 can force traffic into R3 by prepending from R2 → R4
• A shorter ASPATH is preferred
[Diagram: R1, R2 and R3 in AS 10; R2 peers with R4 (AS 20), R3 peers with R5 (AS 30); R4 and R5 both peer with R6 (AS 40)]
AS-PATH Prepending
R2#
router bgp 10
neighbor 10.1.1.4 remote-as 20
neighbor 10.1.1.4 route-map PREPEND_3X out
!
route-map PREPEND_3X permit 10
set as-path prepend 10 10 10
!

[Diagram: R2 (AS 10) peers with R4 (AS 20)]
Community/Local Pref Agreement
• Many providers accept communities from their customers to give customers some control on inbound traffic.
• Example
• Customer sends community 20:80, ISP sets the LOCALPREF to 80
• Customer sends community 20:120, ISP sets the LOCALPREF to 120
[Diagram: R1 and R2 in AS 10; R3 and R4 in AS 20]
Community/LOCALPREF Agreement
R1#
router bgp 10
 neighbor 10.1.1.3 remote-as 20
 neighbor 10.1.1.3 route-map SET_COMMUNITY out
 neighbor 10.1.1.3 send-community
!
route-map SET_COMMUNITY permit 10
 set community 20:120
!
[Diagram: R1 and R2 in AS 10; R3 and R4 in AS 20]
Community/LOCALPREF Agreement
R3#
router bgp 20
 neighbor 10.1.1.1 remote-as 10
 neighbor 10.1.1.1 route-map COMMUNITY_TO_LOCALPREF in
!
ip community-list standard LP_80 permit 20:80
ip community-list standard LP_120 permit 20:120
!
route-map COMMUNITY_TO_LOCALPREF permit 10
 match community LP_80
 set local-preference 80
!
route-map COMMUNITY_TO_LOCALPREF permit 20
 match community LP_120
 set local-preference 120
!
route-map COMMUNITY_TO_LOCALPREF permit 30
!
[Diagram: R1 and R2 in AS 10; R3 and R4 in AS 20]
Enterprise Multi-Homed Internet Edge Architectures
Michael Kowal, BRKRST-2044
• Single Router, 1 Link
• Single Router, 2 Links (Equal and Unequal BW)
• Multiple Routers, Multiple Links (Equal and Unequal BW)
• Multiple Routers, Multiple Firewalls, Multi-Site
[Figure: R1 and R2 connected to the Internet through ISP A and ISP B, with ingress and egress traffic flows]
Troubleshooting BGP
Vinit Jain
BRKRST-2044

Agenda
• BGP General Operation
  • Overview
  • eBGP
  • iBGP
• Attributes and Best Path Selection Algorithm
  • Route Origination
  • AS-PATH
  • NEXTHOP
  • Communities
• Controlling Traffic
  • Controlling Outbound Traffic
  • BGP Multipath
  • Controlling Inbound Traffic
• Route Reflectors
• Convergence
  • Initial Convergence
  • BGP Routing Convergence
• High Availability
• Show and Tell/Demo Lab
Route Reflectors
• A route received from one iBGP peer will NOT be advertised to another iBGP peer
• Full iBGP mesh is required
• n*(n-1)/2 iBGP sessions for n routers: a scaling problem!
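A quick calculation makes the scaling problem concrete. The sketch below compares a full mesh with a deliberately simplified design in which one router acts as route reflector and all others are its clients; the function names are illustrative only.

```python
def full_mesh_sessions(n):
    """iBGP sessions needed when all n routers peer with each other."""
    return n * (n - 1) // 2


def single_rr_sessions(n):
    """Sessions when one of the n routers reflects for the other n-1 (no redundancy)."""
    return n - 1
```

For 100 routers this gives 4950 sessions in a full mesh against 99 with a single reflector. A redundant design with two reflectors would add more sessions, but growth stays linear in n rather than quadratic.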

Route Reflector Basics

• A route reflector is an iBGP speaker that reflects routes learned from iBGP peers to other iBGP peers
• Route reflectors are designated by configuring some of their iBGP peers as route reflector clients
  neighbor <A> route-reflector-client
  neighbor <B> route-reflector-client
Route Reflector Basics
• A route reflector client is just an iBGP speaker
• There is no special configuration for a route reflector client
Route Reflector Basics
• A cluster is a route reflector and its clients
• Route reflector clusters may overlap
Route Reflector Basics
• A non-client is any iBGP peer that is not a route reflector client
• Each route reflector is also a non-client of each other route reflector in this network
• Route reflectors must be fully iBGP meshed
Route Reflector – Advertisement Rules
If a Route Reflector Receives a Route from an eBGP Peer what will it do?
• Send the route to ALL BGP peers (iBGP and eBGP)
Route Reflector – Advertisement Rules
If a Route Reflector Receives a Route from a Client what will it do?
• Reflect the route to all clients
• Reflect the route to all non-clients
• Send the route to all eBGP peers
Route Reflector – Advertisement Rules
If a Route Reflector Receives a Route from a Non-Client what will it do?
• Reflect the route to all clients
• Send the route to all eBGP peers
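The three advertisement rules can be summarised as one small decision function. This is a sketch for study purposes only: peer names and the function name are invented, and `peers` is assumed to list the reflector's other peers (the peer the route came from is left out by the caller).

```python
def advertise_to(source, peers):
    """Return the peers a route reflector sends a route to.

    source: type of the peer the route was learned from
            ("ebgp", "client" or "non-client").
    peers:  dict mapping peer name to its type, excluding the sender.
    """
    if source in ("ebgp", "client"):
        allowed = {"ebgp", "client", "non-client"}
    elif source == "non-client":
        # Routes from a non-client are not passed on to other non-clients.
        allowed = {"ebgp", "client"}
    else:
        raise ValueError(f"unknown peer type: {source}")
    return sorted(name for name, kind in peers.items() if kind in allowed)


peers = {"C1": "client", "C2": "client", "N1": "non-client", "E1": "ebgp"}
```

With this peer set, a route learned from a non-client goes to C1, C2 and E1 but not to N1, which is why non-clients (including other route reflectors) still need a full iBGP mesh among themselves.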
Route Reflector Design and Redundancy
A client may peer with more than one reflector

• A client that peers to only one reflector has a single point of failure

• Clients should peer to at least two reflectors to provide redundancy

Questions:
• How many reflectors should a single client be peered to?
• Where should the RRs be placed in the network?
• How many RRs are needed?

Route Reflector Design and Redundancy
• Redundancy is needed but…
  • Too much burns memory on RRCs because the client learns the same information from each RR
  • Also burns memory on the RRs because they learn multiple paths for each route introduced by an RRC
• Two route reflectors per client should be plenty…
  • …but this is not a hard and fast rule
• As with everything else… “it depends”
  • PEs, RRs, SLAs, network size, network topology, etc.
A word of reason

• Most routers sold in the last decade can easily run 100 or more sessions (it all depends on the number of prefixes carried)
• ASR1000-RP2 scales to thousands of sessions (Isocore tested 20 million routes with 1000 RR clients)
• So RP performance is often not the limiting factor of a full iBGP mesh; it is rather the manageability of adding/removing nodes from the mesh
• So don’t over-engineer it…
Dynamic Neighbors
• Remote peers are defined by IP address range
• Less configuration for defining neighbors
• Remote peers initiate the BGP session
• Enterprise networks (DMVPN, ...)
• iBGP and limited eBGP (limited number of ASNs)

router bgp 1
 bgp listen range 10.1.1.0/24 peer-group SPOKES
 bgp listen range 192.168.0.0/16 peer-group EBGP-PEERS
 bgp listen limit 1000
 neighbor SPOKES peer-group
 neighbor SPOKES remote-as 1
 neighbor SPOKES update-source Loopback0
 neighbor SPOKES route-reflector-client
 neighbor EBGP-PEERS peer-group
 neighbor EBGP-PEERS remote-as 2 alternate-as 3 4 5 6 7
 ...
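Conceptually, the listener maps the source address of an incoming session to a peer-group by checking it against the configured ranges. The sketch below mimics that lookup with the two ranges from the slide; the dictionary and function name are illustrative and say nothing about how the router implements it.

```python
import ipaddress

# Listen ranges taken from the configuration above.
LISTEN_RANGES = {
    "SPOKES": ipaddress.ip_network("10.1.1.0/24"),
    "EBGP-PEERS": ipaddress.ip_network("192.168.0.0/16"),
}


def peer_group_for(source_ip):
    """Return the peer-group whose listen range contains source_ip, else None."""
    addr = ipaddress.ip_address(source_ip)
    for group, network in LISTEN_RANGES.items():
        if addr in network:
            return group
    return None  # no matching range: the session would not be accepted
```

A connection from 10.1.1.25 would inherit the SPOKES settings, while one from 172.16.0.1 matches no range.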
Multi-Protocol BGP
and some Use Cases
Routing Protocol Reachability Announcements
• Interior routing protocols announce networks and topology on how to reach them
  • Network reachability and topology information is often closely coupled
• BGP also advertises network reachability, but leaves out how to reach the Next Hop
  • This allows extending the announcements to much more than IP destinations, using the exact same protocol

EIGRP: “Hey, I am your EIGRP neighbor and you can reach 10.1.1.0/24 through me using metric X”
OSPF: “Hello everyone! I am OSPF router John, here are my OSPF neighbours so you know how to find me, and I also own 10.20.2.0/24”
BGP: “Good day! Glad you’re speaking BGP with me. I can tell you where to reach IPv4 network 10.30.0.0/16, or would you rather learn about IPv6 networks or MAC addresses?”
Control-plane Evolution
• Many services are moving towards BGP to disseminate control-plane information
• Operators’ and designers’ familiarity with BGP is an important factor
• But so is policy control, scale and the extensibility of the protocol

Service/transport       | 200x and before | 2013 and future
IDR (Peering)           | BGP             | BGP (IPv6)
SP L3VPN                | BGP             | BGP + FRR + Scalability
SP Multicast VPN        | PIM             | BGP Multicast VPN
DDOS mitigation         | CLI             | BGP flowspec
Network Monitoring      | SNMP            | BGP monitoring protocol
Security                | Filters         | BGP Sec (RPKI), DDoS Mitigation
Proximity               | -               | BGP connected app API
SP-L3VPN-DC             | -               | BGP Inter-AS, VPN4DC
Business & CE L2VPN     | LDP             | BGP PW Sign (VPLS)
DC Interconnect L2VPN   | -               | BGP MAC Sign (EVPN)
MPLS transport          | LDP             | BGP+Label (Unified MPLS)
Data Center             | OSPF/ISIS       | BGP + Multipath
Massive Scale DMVPN     | NHRP / EIGRP    | BGP + Path Diversity
Campus/Ent L3VPN        | BGP             | BGP
BGP Address and Sub-Address Families
• BGP can advertise multiple Network Protocols’ reachability information: Multi-Protocol BGP
• Address Family (AF): Network Protocol Type (ex: IPv4, IPv6, CLNS, etc.)
• Subsequent Address Family (SAF): Additional semantics to the above, for example unicast or multicast, MPLS VPN addresses, etc.
• Some examples (far from exhaustive):

AFI   | SAFI | Description
1     | 1    | IPv4 Unicast
1     | 2    | IPv4 Multicast
1     | 4    | Labeled IPv4
1     | 128  | L3VPN IPv4 unicast
1     | 129  | L3VPN IPv4 multicast
1     | 133  | Flow-Spec
2     | 1    | IPv6 Unicast
2     | 2    | IPv6 Multicast
2     | 4    | Labeled IPv6 (aka 6PE)
3     | 128  | CLNS VPN
25    | 65   | BGP-VPLS
25    | 70   | EVPN
16388 | 71   | BGP Link-State

See http://www.iana.org/assignments/address-family-numbers/address-family-numbers.xhtml and http://www.iana.org/assignments/safi-namespace/safi-namespace.xhtml for the full list. Not all entries denote an AFI/SAFI used in BGP.
MP-BGP: Address-Family-Identifier Syntax
Original syntax:
router bgp 20
 bgp router-id 20.100.100.20
 bgp log-neighbor-changes
 neighbor 5.20.40.40 remote-as 40
 neighbor 5.20.40.40 password foobar
 neighbor 5.20.40.40 send-community
 network 20.100.100.20 mask 255.255.255.255

AFI/SAFI syntax:
router bgp 20
 bgp router-id 20.100.100.20
 bgp log-neighbor-changes
 neighbor 5.20.40.40 remote-as 40
 neighbor 5.20.40.40 password foobar
 !
 address-family ipv4
  network 20.100.100.20 mask 255.255.255.255
  neighbor 5.20.40.40 activate
  neighbor 5.20.40.40 send-community
 exit-address-family
 !
 address-family vpnv4
  neighbor 5.20.40.40 activate
  neighbor 5.20.40.40 send-community extended

Note: We can advertise multiple AFI/SAFI over the same session!
MPLS-VPN– L3VPN
• Layer 3 VPN carries customer routing information across an MPLS core
• Customer addresses can overlap (think: multiple enterprise customers all using 10.0.0.0/8)
• Problem: How do we differentiate them on the control plane (BGP) and on the forwarding plane (within the backbone)?
[Figure: CE routers of different customers, two of them announcing 10.1.1.0/24, attached to PE 1 and PE 2 across the MPLS backbone]
MPLS-VPN– L3VPN
• Solution:
  1. Control Plane: Make addresses distinguishable by adding an additional identifier: Route Distinguisher (RD): 555:9876:10.1.1.0/24
  2. Forwarding Plane: Add routing contexts on PEs, and carry packets as MPLS labeled packets across the backbone.
• BGP advertises VPNv4 addresses (8 byte RD + 4 byte IPv4 address) and a label as NLRI
• Other BGP attributes (AS-Path, Next-Hop, etc.) are included as seen before

Example advertisement from PE 1 to PE 2 over the MP-iBGP session:
  NLRI: 555:9876:10.1.1.0/24
  Label: 345
  Next-Hop: PE1
  ...
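To illustrate how the RD keeps overlapping customer prefixes apart, the sketch below builds the 12-byte VPNv4 address (8-byte RD followed by the 4-byte IPv4 address). It assumes the type-0 RD layout (2-byte type, 2-byte AS number, 4-byte assigned number); the second RD value 555:1234 is hypothetical, and the prefix length and label that accompany the address in the real NLRI are left out.

```python
import ipaddress
import struct


def vpnv4_address(asn, assigned, prefix):
    """Return the 12-byte VPNv4 address: type-0 RD followed by the IPv4 network."""
    rd = struct.pack("!HHI", 0, asn, assigned)  # type 0, 2-byte ASN, 4-byte number
    network = ipaddress.ip_network(prefix)
    return rd + network.network_address.packed


customer_a = vpnv4_address(555, 9876, "10.1.1.0/24")
customer_b = vpnv4_address(555, 1234, "10.1.1.0/24")  # hypothetical second customer
```

Both customers use 10.1.1.0/24, so the last four bytes are identical, but the differing RDs make the two VPNv4 addresses distinct routes in BGP.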
BGP Routing Convergence:
Initial Convergence
BGP Convergence

• Two general convergence situations

• Initial startup

• Reaction to network failure events

Convergence: Initial Startup
Initial convergence happens when:
• A router boots
• RP failover
• clear ip bgp *

How long initial convergence takes depends on the amount of work to be done and the router/network’s ability to do this fast and efficiently
Convergence: Initial Startup
Question: During initial convergence, what work needs to be done?
• Accept routes from all peers
  • Not too difficult
• Calculate bestpaths
  • This is pretty easy
• Install bestpaths in the RIB
  • Also fairly easy
• Advertise bestpaths to all peers
  • This can be difficult and may take several minutes depending on the following variables…
Convergence: Key Variables
• BGP Variables
• The number of routes
• The number of peers
• The number of update-groups
• The ability to advertise routes to each peer/update-group efficiently

• Router Variables
• CPU horsepower
• Code version
• Interface bandwidth and input & output queues

• Network Variables
• Health of underlying network and transport
• MTU
Convergence: UPDATE Packing
• UPDATE contains a set of Attributes and a list of prefixes (NLRI)
• BGP starts an UPDATE by building an attribute set
• BGP then packs as many destinations (NLRIs) as it can into the UPDATE
• Only NLRI with a matching attribute set can be placed in the UPDATE
• NLRI are added to the UPDATE until it is full (4096 bytes max)

Least efficient: three UPDATEs, each carrying one NLRI with the same attributes
  [MED 50, Origin IGP] 10.1.1.0/24
  [MED 50, Origin IGP] 10.1.2.0/24
  [MED 50, Origin IGP] 10.1.3.0/24
Most efficient: one UPDATE carrying all three NLRI
  [MED 50, Origin IGP] 10.1.1.0/24, 10.1.2.0/24, 10.1.3.0/24
• “UPDATE Packing” refers to how efficiently an implementation packs NLRIs into UPDATEs
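The packing idea amounts to grouping prefixes by their attribute set. The following sketch shows that grouping under a simplifying assumption: it ignores the 4096-byte message limit, so each attribute set yields exactly one UPDATE. The function name and the tuple representation of attributes are illustrative.

```python
from collections import defaultdict


def pack_updates(routes):
    """Group prefixes that share an attribute set into one UPDATE each.

    routes: iterable of (prefix, attributes), where attributes is a hashable tuple.
    Returns a list of (attributes, [prefixes]) pairs, one per UPDATE.
    """
    grouped = defaultdict(list)
    for prefix, attributes in routes:
        grouped[attributes].append(prefix)
    return list(grouped.items())


same_attrs = (("med", 50), ("origin", "igp"))
routes = [
    ("10.1.1.0/24", same_attrs),
    ("10.1.2.0/24", same_attrs),
    ("10.1.3.0/24", same_attrs),
]
updates = pack_updates(routes)
```

Here all three prefixes end up in a single UPDATE. If one of them carried a different MED or an extra community, it would need an UPDATE of its own, which is why fewer attribute sets means fewer messages.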
Convergence: UPDATE Packing
• The fewer attribute sets you have the better
  • More NLRI will share an attribute set
  • Fewer UPDATEs to converge
• Things you can do to reduce attribute sets
  • next-hop-self for all iBGP sessions
  • Don’t accept/send communities you don’t need
Convergence
TCP MSS – Max Segment Size
TCP MSS (max segment size) is also a factor in convergence times. The larger the
MSS the fewer TCP packets it takes to transport the BGP updates. Fewer packets
means less overhead and faster convergence.

BGP UPDATE: Attribute | NLRI ..NLRIs.. NLRI ..NLRIs.. NLRI

Default MSS: the BGP UPDATE is split into two TCP packets
  IP Header | TCP Header | Attribute | NLRI ..NLRIs..
  IP Header | TCP Header | NLRI ..NLRIs.. NLRI

Increased MSS: the entire BGP UPDATE can fit in one TCP packet
  IP Header | TCP Header | Attribute | NLRI ..NLRIs.. NLRI ..NLRIs.. NLRI
Convergence
TCP MSS – Max Segment Size
• MSS (Max Segment Size)
  • Limit on packet size for a TCP socket
  • 536 bytes by default
• Path MTU Discovery
  • Finds smallest MTU between R1 and R2
  • Subtract 40 bytes for TCP/IP overhead
  • Enabled by default for BGP (at least in recent releases)
  • In older releases enable via global cmd “ip tcp path-mtu-discovery”
• To find the MSS
  R1#sh ip bgp neighbors
  BGP neighbor is 2.2.2.2, remote AS 3, external link
  Datagrams (max data segment is 1460 bytes):
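The arithmetic behind these numbers is simple enough to check by hand or with a short script. The sketch below assumes 40 bytes of IP plus TCP header without options, and treats one maximum-size 4096-byte UPDATE in isolation; real TCP is a byte stream, so several UPDATEs can share segments. Function names are illustrative.

```python
import math


def mss_from_mtu(path_mtu):
    """MSS left after subtracting 40 bytes of IP and TCP headers."""
    return path_mtu - 40


def segments_needed(update_bytes, mss):
    """TCP segments needed to carry update_bytes of BGP data."""
    return math.ceil(update_bytes / mss)
```

A 1500-byte path MTU gives the 1460-byte MSS shown in the command output. A full 4096-byte UPDATE needs 8 segments at the 536-byte default but only 3 at 1460 bytes.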
Convergence
Update Groups
• BGP must create updates based on the policies towards each peer
• Peers with a common outbound policy are members of the same update-group
  • iBGP vs. eBGP
  • Outbound route-map, prefix-lists, etc
• UPDATEs are generated for one member of an update-group and then replicated to the other members
• Back in the old days, these “update-groups” had to be created specifically, using “peer-groups”. They’re still widely deployed…

Less efficient: two peers in different update-groups, so the UPDATE (Attribute, NLRI, NLRI) is built once per peer
More efficient: two peers in the same update-group, so the UPDATE (Attribute, NLRI, NLRI) is built once and replicated
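How peers fall into update-groups can be pictured as grouping by an outbound-policy key. In this sketch the key is just the session type plus an outbound route-map name; that is a simplification of what a router actually compares, and the peer and route-map names are hypothetical.

```python
from collections import defaultdict


def build_update_groups(peers):
    """Group peers that share the same outbound policy key.

    peers: dict mapping peer name to (session_type, outbound_route_map).
    Returns a dict mapping each policy key to the list of member peers.
    """
    groups = defaultdict(list)
    for name, policy_key in peers.items():
        groups[policy_key].append(name)
    return dict(groups)


peers = {
    "R1": ("ibgp", "RM_CORE"),
    "R2": ("ibgp", "RM_CORE"),
    "R3": ("ebgp", "RM_CUST"),
}
groups = build_update_groups(peers)
```

R1 and R2 share one group, so their UPDATEs are formatted once and replicated; R3 needs its own. Giving every peer an individual route-map would create one group per peer and remove that saving.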
Convergence
Dropping TCP Acks
Primarily an issue on RRs (Route Reflectors) with
• One or two interfaces connecting to the core
• Hundreds of RRCs (Route Reflector Clients)

• RR sends out tons of UPDATEs to RRCs
• RRCs send TCP ACKs
• RR core facing interface(s) receive huge wave of TCP ACKs
Convergence
Dropping TCP Acks
• Interface input queue fills up… TCP ACKs are dropped
• Each time a TCP packet is dropped, the session goes into slow start
• It takes a good deal of time for a TCP session to come out of slow start
• Increase the input queue
  • hold-queue 1000 in
  • If you still see drops increase to 4096
Convergence
Question: How do you know if BGP has converged?
Answer: BGP Table Version

BGP Table Version

• Understanding the BGP Table Version – Part 1: Introduction to BGP Table Version
http://www.networkingwithfish.com/understanding-the-bgp-table-version-part-1-introduction-to-bgp-table-version/

• Understanding the BGP Table Version – Part 2: BGP Table Version in Action
http://www.networkingwithfish.com/understanding-the-bgp-table-version-part-2-bgp-table-version-in-action/

• Understanding the BGP Table Version – Part 3: BGP Table Version & Troubleshooting
http://www.networkingwithfish.com/understanding-the-bgp-table-version-part-3-bgp-table-version-troubleshooting/
Convergence
Initial Convergence Summary
• Initial convergence time depends on the amount of work that needs to be done and the router/network’s ability to do this fast and efficiently
• Reduce the number of attribute sets in BGP
  • Use next-hop-self, don’t send/accept communities you don’t need, etc.
• Reduce the number of unique outbound policies towards all peers
  • Try to find a small set of common policies, rather than individualizing policies per peer
  • The fewer update-groups the better
• MSS/PMTU
  • Efficient packaging of BGP messages in TCP
• Stop TCP ACK drops
  • Increase interface input queues on RRs
BGP Routing Convergence:
Reaction to Failure
IGP vs. BGP Convergence
• IGP (OSPF/ISIS) deals with hundreds of routes
  • A few thousand at most, but only a few hundred are really important/relevant
• BGP is designed to carry millions of routes
  • and these days several customers carry that many prefixes!
• We can tune IGPs to converge in << 1 second
• But how about BGP?
BGP Control-Plane Convergence Components
• Failure Detection
• Reaction to Failure
• Failure Propagation

Convergence = Failure Detection + Event Propagation + Routing Process + FIB Update
(Failure Detection: neighbor down; Event Propagation: tell neighbors; FIB Update: RIB + CEF + hardware)
Failure Detection (Edge)
• Problem: Detect an eBGP neighbour failure
• Available Methods
  • Fast External Fallover: monitors line protocol for directly connected neighbours (default behaviour)
      router bgp …
       [no] bgp fast-external-fallover
      interface …
       ip bgp fast-external-fallover {permit|deny}
  • Fast Session Deactivation (FSD): monitors routing table for reachability of next-hop address (eBGP multi-hop)
      router bgp …
       neighbor x.x.x.x fall-over
  • “Hello”-type protocols: BFD and BGP Hello
      router bgp …
       timers bgp <hello> <hold>
       neighbor <..> timers <hello> <hold>
       neighbor <..> fall-over bfd
Failure Detection – Next-Hop Failure
• Goal: Detect next-hop failures (as carried in IGP)
• Methods:
• Next-hop Tracking, enabled by default
• BGP scanner (legacy, very slow reaction)

• Note: In most cases, we do not want to use iBGP hellos to detect internal/iBGP neighbor failures, and instead rely on next-hop reachability checks
Control vs. Data Plane Convergence – BGP PIC
• Control Plane Convergence
  • For the topology after the failure, the optimal path is known and installed in the dataplane
  • May be extremely long (depends on number of prefixes carried)
• Data Plane Convergence
  • Once IGP convergence has detected the failure, the packets are rerouted onto a valid path to the BGP destination
  • While valid, this path may not be the optimal one from a control plane convergence viewpoint
  • BGP PIC can deliver this behaviour, in a prefix-independent way, no matter if BGP carries 1000 or 1,000,000 prefixes!
Deploying BGP Fast
Convergence / BGP PIC
Oliver Boehmer, BRKIPM-2265
BGP Prefix Independent Convergence (BGP PIC)
[Figure: BGP prefixes (110.0.0.0/24, 110.1.0.0/24, … 110.5.0.0/24) point to a shared BGP pathlist of BGP next hops (PE1, PE2); each next hop points to an IGP pathlist of IGP next hops (PE1 via P1 and via P2; PE2 via P2), which points to the output interface (Gig1, dmac=x; Gig2, dmac=y). Topology: PE3 reaches PE1 and PE2 through P1 and P2]

• Pointer indirection between BGP and IGP entries allows for immediate update of the multipath BGP pathlist at IGP convergence
• Only the parts of the FIB actually affected by a change need to be touched
• Used in newer IOS and IOS-XR (all platforms), enables Prefix Independent Convergence
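The benefit of the indirection can be shown with a toy model in which every prefix holds a reference to one shared pathlist object instead of its own copy. This is only a sketch of the idea: the class and function names are invented, and a real FIB lives in hardware and is far more involved.

```python
class PathList:
    """A list of BGP next hops shared by many prefixes."""

    def __init__(self, next_hops):
        self.next_hops = list(next_hops)


# Six prefixes (110.0.0.0/24 ... 110.5.0.0/24) all reference the same pathlist.
shared = PathList(["PE1", "PE2"])
fib = {f"110.{i}.0.0/24": shared for i in range(6)}


def next_hop_failed(pathlist, failed):
    """Remove a failed next hop from the shared pathlist in a single update."""
    pathlist.next_hops = [nh for nh in pathlist.next_hops if nh != failed]


next_hop_failed(shared, "PE1")
```

After the single call, every prefix forwards via PE2 only, and no per-prefix entry had to be rewritten. The cost of the repair is therefore the same whether the table holds six prefixes or a million.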
Wrapping up
... or why we think BGP is close to the best thing since sliced bread
Wrapping up...
• BGP is old, but proven! And scalable!
• BGP is THE protocol to implement complex routing policies, but comes with enough rope to hang yourself
• BGP can extend to carry much more than IP reachability information!
• BGP can’t live alone, it usually requires an underlying IGP
• We haven’t touched on this today, but BGP can react very fast to failures
• BGP requires hands-on practice, so let’s dive right into it...
“Show and Tell”
[Figure: lab topology]
Routers and Loop0 addresses:
• R1 (IOSv, AS 10): 10.100.100.1 (the slide shows 2001:db8:3:3::3/128, the same IPv6 address as R3)
• R2 (IOSv, AS 10): 10.100.100.2, 2001:db8:2:2::2/128
• R3 (IOSv, AS 10): 10.100.100.3, 2001:db8:3:3::3/128
• R20 (IOS XE, AS 20): 20.100.100.20, 2001:db8:20:20::20/128
• R30 (IOS XR, AS 30): 30.100.100.30, 2001:db8:30:30::30/128
• Internet (IOSv, AS 40): 40.100.100.40, 2001:db8:40:40::40/128
Link subnets:
• R1 to R2: 10.1.2.0, 2001:db8:1:2::
• R1 to R3: 10.1.3.0, 2001:db8:1:3::
• R2 to R3: 10.2.3.0, 2001:db8:2:3::
• R2 to R20: 20.2.20.0, 2001:db8:2:20::
• R3 to R30: 30.3.30.0, 2001:db8:3:30::
BGP Show and Tell: Beginners
https://www.youtube.com/playlist?list=PLVuziKl5zsd6VW41lIl3SWC3nT1oISZBj
Thank You
