BGP Best Path Selection Algorithm With Examples

BGP uses a multi-step process to select the best path for a prefix. It first checks if the path is valid, then uses an algorithm to select the best path among multiple options. The algorithm prefers paths with the highest weight, local preference, shortest AS path length, and lowest IGP metric to the next hop. Local preference is transmitted between iBGP peers and can be used to influence the best path selection.

Uploaded by

PA2 kspl

BGP is the protocol used to announce prefixes throughout the Internet. It’s a very robust
protocol, well suited to carrying large numbers of prefixes, such as the Internet prefixes or
the internal client prefixes of an ISP.

When a prefix is received in BGP, the path passes through two steps before being chosen as
a candidate to populate the RIB.

The first step consists of checking whether the path is valid. If it is, the prefix is installed
in the BGP table, and the second step, best-path selection, can start.

In order to pass this first check, the path must meet the following requirements:

• The prefix must not be marked as “not-synchronized”
• There must be a route in the RIB to reach the next-hop
• For prefixes learned through eBGP sessions, the local ASN must not be in the
AS_PATH of the prefix
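
The three checks above can be sketched in a few lines of Python. This is only an illustration: the `Path` class, its field names and the `is_valid` function are hypothetical, and the set of reachable next-hops stands in for a real RIB lookup, which would be a longest-prefix match.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Path:
    next_hop: str
    as_path: Tuple[int, ...]   # ASNs in the AS_PATH, neighboring AS first
    is_ebgp: bool              # True if learned over an eBGP session
    synchronized: bool = True  # False if marked "not-synchronized"


def is_valid(path: Path, local_asn: int, reachable_next_hops: Set[str]) -> bool:
    """Return True if the path passes the three validity checks."""
    # Check 1: the prefix must not be marked as not-synchronized.
    if not path.synchronized:
        return False
    # Check 2: there must be a route to the next-hop (simplified to set membership).
    if path.next_hop not in reachable_next_hops:
        return False
    # Check 3: for eBGP-learned paths, our own ASN must not be in the AS_PATH.
    if path.is_ebgp and local_asn in path.as_path:
        return False
    return True
```

For example, a path learned over eBGP whose AS_PATH already contains the local ASN 65001 would be rejected, while the same path with AS_PATH 65002 would be accepted as long as its next-hop is reachable.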

In the second step, the best path to reach the prefix is selected. If there is only one path,
no comparison is needed. If there are several paths to reach the prefix, BGP uses a specific
algorithm to select the best one, and this is what I want to talk about.

This algorithm dictates the following:

1. Prefer the path with the highest WEIGHT
2. Prefer the path with the highest LOCAL PREFERENCE
3. Prefer the path that was locally originated via a network or redistribute command
over one originated via the aggregate-address command
4. Prefer the path with the shortest AS_PATH
5. Prefer the path with the lowest ORIGIN type
6. Prefer the path with the lowest MULTI-EXIT DISCRIMINATOR (MED)
7. Prefer eBGP over iBGP
8. Prefer the path with the lowest IGP metric to the BGP next-hop
9. When both paths are external, prefer the one that was received first
10. Prefer the route that comes from the BGP router with the lowest router ID
11. If the originator or router ID is the same for multiple paths, prefer the path with the
minimum cluster list length
12. Prefer the path that comes from the lowest neighbor address

As you can see, the selection process is quite long, although in most cases the selection
doesn’t go further than point 8.
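
As a rough illustration of how steps 1 through 8 chain together, here is a Python sketch that compares two paths at a time. The `Path` class, its field names and the `better` and `best_path` helpers are hypothetical names made up for this example, not anything from IOS. Steps 9 to 12 are not modelled, paths with an empty AS_PATH are assumed to be MED-comparable with each other, and real implementations differ in details.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Optional, Tuple

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}
# "sourced" = network/redistribute, "aggregate" = aggregate-address,
# "learned" = received from a peer (assumed to rank last at step 3)
SOURCE_RANK = {"sourced": 0, "aggregate": 1, "learned": 2}


@dataclass(frozen=True)
class Path:
    name: str
    weight: int = 0
    local_pref: int = 100
    source: str = "learned"
    as_path: Tuple[int, ...] = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = False
    igp_metric: int = 0


def neighbor_as(p: Path) -> Optional[int]:
    """First AS in the AS_PATH, or None for an empty AS_PATH."""
    return p.as_path[0] if p.as_path else None


def better(a: Path, b: Path) -> Path:
    """Return the preferred path of the two, walking steps 1 to 8 in order."""
    steps = [
        (-a.weight, -b.weight),                          # 1. highest weight
        (-a.local_pref, -b.local_pref),                  # 2. highest local-preference
        (SOURCE_RANK[a.source], SOURCE_RANK[b.source]),  # 3. locally originated
        (len(a.as_path), len(b.as_path)),                # 4. shortest AS_PATH
        (ORIGIN_RANK[a.origin], ORIGIN_RANK[b.origin]),  # 5. lowest origin type
    ]
    if neighbor_as(a) == neighbor_as(b):
        steps.append((a.med, b.med))                     # 6. lowest MED, same neighbor AS only
    steps.append((not a.is_ebgp, not b.is_ebgp))         # 7. eBGP over iBGP
    steps.append((a.igp_metric, b.igp_metric))           # 8. lowest IGP metric to next-hop
    for x, y in steps:
        if x != y:
            return a if x < y else b
    return a  # still tied: steps 9 to 12 are not modelled, keep the first path


def best_path(paths):
    """Reduce a non-empty list of paths to the preferred one."""
    return reduce(better, paths)
```

Because MED is only compared between some pairs of paths, a pairwise reduction like this one can give a result that depends on the order of the input list. It is a sketch of the ordering of the steps, not a faithful model of any router.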

Let’s study points 1 through 8 and how we can influence them within the following lab. The
prefix we are going to be working with is 100.100.100.0/24, announced by R4 and R6:

1.- PATH WITH HIGHEST WEIGHT

Weight is a Cisco-specific attribute, which means it is not part of the BGP standard. It is local
to the router on which it’s configured, so it’s not advertised with the prefix to other peers.
It is used to tell the router which path to use to reach the prefix. The highest value
wins.

It’s the first attribute checked by BGP, so if there are two different paths for the same prefix
but with different Weight values, the path with the highest value wins.

In the lab scenario, R4 and R6 both announce the prefix 100.100.100.0/24, one through an
eBGP session and the other through an iBGP session. Let’s check how R2 and R1 see this prefix
without changing anything:

R2#show ip bgp
BGP table version is 3, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  100.100.100.0/24 4.4.4.4                  0             0 65002 i
*>i                 6.6.6.6                  0    100      0 i

R2#show ip bgp 100.100.100.0/24


BGP routing table entry for 100.100.100.0/24, version 3
Paths: (2 available, best #2, table default)
Advertised to update-groups:
13 16
65002
4.4.4.4 (metric 11) from 4.4.4.4 (4.4.4.4)
Origin IGP, metric 0, localpref 100, valid, external
Local
6.6.6.6 (metric 11) from 6.6.6.6 (6.6.6.6)
Origin IGP, metric 0, localpref 100, valid, internal, best

R2 gets two paths for the prefix 100.100.100.0/24: one from an eBGP peer and the
other from an iBGP peer. We might initially expect R2 to choose the path through the eBGP
peer, since the administrative distance of eBGP (20) is lower than that of iBGP (200), but
that’s not what really happens.

R2 picks the one from the iBGP peer as the best one because, as we will see later, it’s the
one with the shortest AS_PATH length. Both paths (through R4 and through R6) have the
same weight, local-preference and route origin. So the tie-breaker is the shorter AS_PATH,
that is, the path through R6.

Let’s see what happens when the weight parameter is configured on R2:

R2#conf term
R2(config)#router bgp 65001
R2(config-router)#neig 4.4.4.4 weight 200
R2(config-router)#end
R2#clear ip bgp 4.4.4.4

R2#sh ip bgp
BGP table version is 4, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 100.100.100.0/24 4.4.4.4                  0           200 65002 i
* i                 6.6.6.6                  0    100      0 i

Now R2 takes the path through R4. And it announces this path to R1 as its own choice, but
we said the weight attribute is not attached to the prefix, so if R1 had a BGP session with R6,
it would prefer the path through R6 as R2 did at the beginning.

Let’s build this BGP session between R1 and R6, and let’s see which path R1 chooses:

R1#sh ip bgp sum
BGP router identifier 1.1.1.1, local AS number 65001
....
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
2.2.2.2         4 65001      30      30       14    0    0 00:24:37        1
6.6.6.6         4 65001       4       3       14    0    0 00:00:31        1

R1#sh ip bgp 100.100.100.0/24


BGP routing table entry for 100.100.100.0/24, version 14
Paths: (2 available, best #1, table default)
Not advertised to any peer
Local
6.6.6.6 (metric 21) from 6.6.6.6 (6.6.6.6)
Origin IGP, metric 0, localpref 100, valid, internal, best
65002
4.4.4.4 (metric 21) from 2.2.2.2 (2.2.2.2)
Origin IGP, metric 0, localpref 100, valid, internal
R1#

Although R2 prefers the path through R4, R1 prefers the path through R6 because it has a
shorter AS_PATH.

So as I said before, the weight attribute only has local significance, and it’s not attached to
the prefix when announced via BGP.

2.- PATH WITH HIGHEST LOCAL-PREFERENCE

When all the paths to the destination have the same weight value, the next attribute to be
checked is Local-Preference.

Local-preference is a standard attribute, and it’s exchanged only between iBGP peers; it is
not sent to eBGP peers.

This attribute is set on incoming or outgoing prefixes by applying a route-map to the peer
session. If the route-map has no match clause selecting specific prefixes, the local-preference
is set on all the prefixes sent to or received from that peer. The highest value wins.

Let’s get back to the original scenario. R4, R3, and R6 are announcing the same
100.100.100.0/24 prefix, but R3 is announcing it with a local-preference of 150:

R2#sh ip bgp
BGP table version is 7, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i100.100.100.0/24 3.3.3.3                  0    150      0 i
*                   4.4.4.4                  0             0 65002 i
* i                 6.6.6.6                  0    100      0 i

R2#sh ip bgp 100.100.100.0/24


BGP routing table entry for 100.100.100.0/24, version 7
Paths: (3 available, best #1, table default)
Flag: 0x800
Advertised to update-groups:
13 18
Local, (Received from a RR-client)
3.3.3.3 (metric 11) from 3.3.3.3 (3.3.3.3)
Origin IGP, metric 0, localpref 150, valid, internal, best
65002
4.4.4.4 (metric 11) from 4.4.4.4 (4.4.4.4)
Origin IGP, metric 0, localpref 100, valid, external
Local, (Received from a RR-client)
6.6.6.6 (metric 11) from 6.6.6.6 (6.6.6.6)
Origin IGP, metric 0, localpref 100, valid, internal

This makes R2 select the path through R3 as the best choice, and announce this choice to its
other iBGP neighbors, as we can see on R1:

R1#sh ip bgp 100.100.100.0/24
BGP routing table entry for 100.100.100.0/24, version 17
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
3.3.3.3 (metric 11) from 2.2.2.2 (2.2.2.2)
Origin IGP, metric 0, localpref 150, valid, internal, best
Originator: 3.3.3.3, Cluster list: 2.2.2.2

As we can see, the value of Local-Preference is attached to the prefix.
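
The contrast with weight can be pictured with a small Python sketch. The `Path` class and the `advertise_ibgp` function are hypothetical, and the model deliberately ignores everything except these two attributes: weight is reset because it is never carried in an update, while local-preference travels with the prefix to iBGP peers.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    local_pref: int = 100
    weight: int = 0


def advertise_ibgp(path: Path) -> Path:
    """Return the path as the receiving iBGP peer would see it.

    LOCAL_PREF is kept because it is attached to the prefix.
    Weight is local to the sending router, so the receiver starts
    from the default weight for learned paths (0).
    """
    return replace(path, weight=0)


# Hypothetical values: the sender prefers this path with weight 200.
on_sender = Path("100.100.100.0/24", "3.3.3.3", local_pref=150, weight=200)
on_receiver = advertise_ibgp(on_sender)
print(on_receiver.local_pref, on_receiver.weight)
```

The receiver still sees local-preference 150, but none of the sender's weight.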

In order to change this decision, we can configure a route-map on R2 with a higher local-
preference value and apply it to the session with R6. After resetting the session with R6 on
R2, the prefix announced by R6 will have the highest local-preference value, so R2 will
choose this new path. At the same time, R2 will announce it this way to its clients:

R2#configure t
R2(config)#route-map LP-200
R2(config-route-map)#set local-preference 200
R2(config-route-map)#exit
R2(config)#router bgp 65001
R2(config-router)#neig 6.6.6.6 route-map LP-200 in
R2(config-router)#end
R2#clear ip bgp 6.6.6.6
R2#sh ip bgp
BGP table version is 8, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i100.100.100.0/24 6.6.6.6                  0    200      0 i
* i                 3.3.3.3                  0    150      0 i
*                   4.4.4.4                  0             0 65002 i

R1#show ip bgp 100.100.100.0/24
BGP routing table entry for 100.100.100.0/24, version 18
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
6.6.6.6 (metric 21) from 2.2.2.2 (2.2.2.2)
Origin IGP, metric 0, localpref 200, valid, internal, best
Originator: 6.6.6.6, Cluster list: 2.2.2.2

A path without LOCAL_PREF is considered to have the value set with the bgp
default local-preference command or, if that is not configured, the default value of 100.
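
That fallback rule can be written as a tiny Python function. The function name and its parameters are hypothetical; `configured_default` stands for whatever value was set with bgp default local-preference, or `None` if the command is absent.

```python
from typing import Optional

DEFAULT_LOCAL_PREF = 100  # value used when nothing is configured


def effective_local_pref(received: Optional[int],
                         configured_default: Optional[int] = None) -> int:
    """Local-preference the router uses when comparing a path."""
    if received is not None:
        return received            # the path carries LOCAL_PREF
    if configured_default is not None:
        return configured_default  # bgp default local-preference <value>
    return DEFAULT_LOCAL_PREF


print(effective_local_pref(None))       # 100
print(effective_local_pref(None, 300))  # 300
print(effective_local_pref(150, 300))   # 150
```

This would also be consistent with the outputs above, where the eBGP path from R4 shows an empty LocPrf column in the table but localpref 100 in the detailed view.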

3.- PATH LOCALLY ORIGINATED

This point is reached if all of the above attributes have the same value for all the feasible
paths.

Local paths that are sourced by the network or redistribute commands are preferred over
local aggregates that are sourced by the aggregate-address command.

Let’s get back to the original scenario.


Now R5 is announcing the prefix 100.100.100.0/30 to R3 over an iBGP session. R3
generates the BGP aggregate prefix 100.100.100.0/24 using the aggregate-address command,
and also originates the same prefix by redistributing its Loopback100 interface:

R3#show ip bgp
BGP table version is 4, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
s>i100.100.100.0/30 5.5.5.5                  0    100      0 i
*  100.100.100.0/24 0.0.0.0                            32768 i
*>                  0.0.0.0                  0         32768 ?

R3#sh ip bgp 100.100.100.0/24


BGP routing table entry for 100.100.100.0/24, version 3
Paths: (2 available, best #2, table default)
Advertised to update-groups:
16 17
Local, (aggregated by 65001 3.3.3.3)
0.0.0.0 from 0.0.0.0 (3.3.3.3)
Origin IGP, localpref 100, weight 32768, valid, aggregated, local,
atomic-aggregate
Local
0.0.0.0 from 0.0.0.0 (3.3.3.3)
Origin incomplete, metric 0, localpref 100, weight 32768, valid,
sourced, best

R3 prefers the path originated via the redistribute command, instead of the one from the
aggregate command. And that path is the one announced to R2.

4.- PATH WITH SHORTEST AS_PATH

If none of the above attributes break the tie and the router doesn’t have the prefix locally
generated, the next parameter to check is the AS_PATH attribute.

The AS_PATH is a well-known mandatory attribute. This means every prefix carries it, and
every router must understand it. The shorter the AS_PATH, the more preferable
the path.

Let’s get back again to the original scenario, with all the attributes seen so far left at
their defaults.

In this scenario, the prefix received from R4 has the longest AS_PATH because it is learned
over an eBGP session, so it carries R4’s AS (65002), while the path from R6 has an empty AS_PATH.

R2#sh ip bgp
BGP table version is 61, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i100.100.100.0/24 6.6.6.6                  0    100      0 i
*                   4.4.4.4                  0             0 65002 i

That’s why R2 prefers the iBGP path over the eBGP path.

The AS_PATH attribute must be manipulated on an eBGP session. Between
iBGP peers it is not possible to manipulate the AS_PATH (although you can hide it with the
aggregate-address command, or alter it with confederations).
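
Since aggregates and confederations change what the AS_PATH contains, it helps to be explicit about what "length" means. The Python sketch below is written under assumptions that should be checked against the vendor documentation: each AS in an AS_SEQUENCE counts as one, a whole AS_SET counts as one regardless of its size, and confederation segments are not counted. The segment type names and the `as_path_length` function are hypothetical.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[str, Sequence[int]]  # (segment type, list of ASNs)


def as_path_length(segments: List[Segment]) -> int:
    """Length of an AS_PATH as used for comparison (assumed counting rules)."""
    length = 0
    for seg_type, asns in segments:
        if seg_type == "sequence":
            length += len(asns)         # every AS in a sequence counts
        elif seg_type == "set":
            length += 1 if asns else 0  # a non-empty set counts once
        # "confed_sequence" and "confed_set" are assumed not to count
    return length


print(as_path_length([]))                       # path from R6: 0
print(as_path_length([("sequence", [65002])]))  # path from R4: 1
```

With these rules the path from R6 (empty AS_PATH) has length 0 and the path from R4 (65002) has length 1, which matches the choice R2 makes in the lab.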

5.- PATH WITH LOWEST ORIGIN

ORIGIN is also a well-known mandatory attribute, like NEXT_HOP and AS_PATH, so every BGP
prefix carries it.

There are 3 origin types: IGP, EGP and INCOMPLETE.

IGP is more preferable than Exterior Gateway Protocol (EGP), and EGP is more preferable
than INCOMPLETE.

Typically, when a prefix is generated by the command network, it gets the type IGP, and
when it’s redistributed from another protocol, it gets the type INCOMPLETE.
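
The ordering of the three origin types can be expressed as a small Python lookup. The numeric values below are the ORIGIN codes defined by the BGP specification (0 for IGP, 1 for EGP, 2 for INCOMPLETE), so "lowest origin" is simply the lowest code. The `preferred_origin` helper is a hypothetical name for this example.

```python
ORIGIN_CODE = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


def preferred_origin(a: str, b: str) -> str:
    """Return the more preferable of two origin types (lowest code wins)."""
    return min(a, b, key=lambda origin: ORIGIN_CODE[origin])


print(preferred_origin("INCOMPLETE", "IGP"))  # IGP
print(preferred_origin("EGP", "INCOMPLETE"))  # EGP
```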

In our scenario, R6 is generating the prefix 100.100.100.0/24 by redistributing its
Loopback100 interface:

R6#show route-map
route-map CONN, permit, sequence 10
Match clauses:
interface Loopback100
Set clauses:
Policy routing matches: 0 packets, 0 bytes
R6#conf term
R6(config)#router bgp 65001
R6(config-router)#redistribute connected route-map CONN
R6(config-router)#end
R6#clear ip bgp

R2#sh ip bgp 100.100.100.0/24
BGP routing table entry for 100.100.100.0/24, version 76
Paths: (3 available, best #1, table default)
Advertised to update-groups:
13 18
Local, (Received from a RR-client)
3.3.3.3 (metric 11) from 3.3.3.3 (3.3.3.3)
Origin IGP, metric 0, localpref 100, valid, internal, best
Local, (Received from a RR-client)
6.6.6.6 (metric 11) from 6.6.6.6 (6.6.6.6)
Origin incomplete, metric 0, localpref 100, valid, internal
65002
4.4.4.4 (metric 11) from 4.4.4.4 (4.4.4.4)
Origin IGP, metric 0, localpref 100, valid, external

R2 prefers the path through R3 over the one through R6 because of the origin type: IGP is
preferred over INCOMPLETE.


In order to change the origin type, a route-map must be used:

R6#conf term
Enter configuration commands, one per line. End with CNTL/Z.
R6(config)#route-map CONN
R6(config-route-map)#set origin igp
R6(config-route-map)#end
R6#clear ip bgp 2.2.2.2

R2#sh ip bgp 100.100.100.0/24
BGP routing table entry for 100.100.100.0/24, version 76
Paths: (3 available, best #1, table default)
Advertised to update-groups:
13 18
Local, (Received from a RR-client)
6.6.6.6 (metric 11) from 6.6.6.6 (6.6.6.6)
Origin IGP, metric 0, localpref 100, valid, internal, best
Local, (Received from a RR-client)
3.3.3.3 (metric 11) from 3.3.3.3 (3.3.3.3)
Origin IGP, metric 0, localpref 100, valid, internal
65002
4.4.4.4 (metric 11) from 4.4.4.4 (4.4.4.4)
Origin IGP, metric 0, localpref 100, valid, external

6.- PATH WITH THE LOWEST MED

MED comparison only occurs if the first (the neighboring) AS is the same in the two paths
being compared. There are other implications (check the Cisco documentation to know more
about this parameter).

It’s an optional non-transitive attribute, so it is not propagated beyond the neighboring AS,
and its usage as a tie-breaker between several paths depends on each AS’s policy. The lowest
MED is the most preferable.
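
The "same neighboring AS" condition can be sketched in Python as follows. The `med_decides` function and its parameters are hypothetical. Treating two paths with an empty AS_PATH as comparable is an assumption of this sketch, and the `always_compare` flag only mimics the idea behind Cisco's bgp always-compare-med option.

```python
from typing import Optional, Tuple


def med_decides(a_as_path: Tuple[int, ...], a_med: int,
                b_as_path: Tuple[int, ...], b_med: int,
                always_compare: bool = False) -> Optional[str]:
    """Return "a" or "b" if MED breaks the tie, or None if it does not."""
    a_first = a_as_path[0] if a_as_path else None
    b_first = b_as_path[0] if b_as_path else None
    # MED is only meaningful between paths from the same neighboring AS.
    if a_first != b_first and not always_compare:
        return None
    if a_med == b_med:
        return None
    return "a" if a_med < b_med else "b"


print(med_decides((65002,), 50, (65002,), 10))  # b
print(med_decides((65002,), 50, (65003,), 10))  # None
```

In the second call the two paths come from different neighboring ASes, so MED is skipped and the decision moves on to the next step.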

MED can be manipulated using a route-map:

R3#conf term
R3(config)#route-map MED
R3(config-route-map)#set metric 20000
R3(config-route-map)#router bgp 65001
R3(config-router)#neig 2.2.2.2 route-map MED out
R3(config-router)#end
R3#clear ip bgp 2.2.2.2
R6#conf term
R6(config)#route-map MED
R6(config-route-map)#set metric 1000
R6(config-route-map)#exit
R6(config)#router bgp 65001
R6(config-router)#neig 2.2.2.2 route-map MED out
R6(config-router)#end
R6#clear ip bgp 2.2.2.2
R2#sh ip bgp
BGP table version is 81, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i100.100.100.0/24 3.3.3.3               2000    100      0 i
*>i                 6.6.6.6               1000    100      0 i
*                   4.4.4.4                  0             0 65002 i

R2 now prefers the path through R6, the one with the lowest MED (1000).

7.- PREFER EBGP OVER IBGP

We have reached the most interesting point. In the first part of the post, we saw that the
path through R6, which is an iBGP peer, was preferred over the path through R4, which is an
eBGP peer.

This is because whether the route was learned via iBGP or eBGP is not considered until all
the above attributes are equal. Only in that case is the path learned through an eBGP session
preferred over the one learned through an iBGP session.

In order to try this, I have changed the scenario a little. Now R5 keeps an eBGP session
with R3, and it announces the prefix 100.100.100.0/24.

R4 has an eBGP session with R2, and it also announces the prefix 100.100.100.0/24.
Between R2 and R3 there is an iBGP session, but R2 filters everything towards R3.

In this situation, we see that R2 gets two paths for the prefix 100.100.100.0/24. Both paths
have the same attributes, but one of them is through an iBGP peer, and the other one through
an eBGP peer:

R2#sh ip bgp
BGP table version is 84, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i100.100.100.0/24 5.5.5.5                  0    100      0 65003 i
*>                  4.4.4.4                  0             0 65002 i

R2#sh ip bgp 100.100.100.0/24


BGP routing table entry for 100.100.100.0/24, version 84
Paths: (2 available, best #2, table default)
Advertised to update-groups:
13
65003, (Received from a RR-client)
5.5.5.5 (metric 21) from 3.3.3.3 (3.3.3.3)
Origin IGP, metric 0, localpref 100, valid, internal
65002
4.4.4.4 (metric 11) from 4.4.4.4 (4.4.4.4)
Origin IGP, metric 0, localpref 100, valid, external, best

R2 prefers the path through the eBGP peer, although it has another path through an iBGP
peer.

8.- PATH WITH LOWEST IGP METRIC

If all the above attributes are equal and no path has been chosen yet, the next parameter to
check is the IGP cost to reach the different next-hops of the prefix.

Getting back to the original scenario, I changed the OSPF cost of R3’s loopback. Now only
R6 and R3 are announcing the prefix 100.100.100.0/24:

R2#sh ip bgp
BGP table version is 88, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i100.100.100.0/24 3.3.3.3                  0    100      0 i
*>i                 6.6.6.6                  0    100      0 i

R2#sh ip bgp 100.100.100.0/24


BGP routing table entry for 100.100.100.0/24, version 88
Paths: (2 available, best #2, table default)
Advertised to update-groups:
13
Local, (Received from a RR-client)
3.3.3.3 (metric 1010) from 3.3.3.3 (3.3.3.3)
Origin IGP, metric 0, localpref 100, valid, internal
Local, (Received from a RR-client)
6.6.6.6 (metric 11) from 6.6.6.6 (6.6.6.6)
Origin IGP, metric 0, localpref 100, valid, internal, best

R2#sh ip route 3.3.3.3


Routing entry for 3.3.3.3/32
Known via "ospf 1", distance 110, metric 1010, type intra area
Last update from 10.10.23.3 on Ethernet0/2, 00:00:47 ago
Routing Descriptor Blocks:
* 10.10.23.3, from 3.3.3.3, 00:00:47 ago, via Ethernet0/2
Route metric is 1010, traffic share count is 1

R2#sh ip route 6.6.6.6


Routing entry for 6.6.6.6/32
Known via "ospf 1", distance 110, metric 11, type intra area
Last update from 10.10.26.6 on Ethernet0/3, 05:23:31 ago
Routing Descriptor Blocks:
* 10.10.26.6, from 6.6.6.6, 05:23:31 ago, via Ethernet0/3
Route metric is 11, traffic share count is 1

R2 prefers the path through R6 because the OSPF metric to reach that next-hop is smaller;
all the other parameters are exactly the same for both paths.
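
Steps 7 and 8 together behave like a two-part sort key, which can be sketched in Python as follows. The tuple layout and the `prefer` function are hypothetical, and the sketch assumes the candidate paths are already tied on steps 1 to 6. The metric values are the ones shown in the outputs of sections 7 and 8.

```python
from typing import List, Tuple

Candidate = Tuple[str, bool, int]  # (label, learned via eBGP?, IGP metric to next-hop)


def prefer(paths: List[Candidate]) -> str:
    """Pick a path by step 7 (eBGP over iBGP), then step 8 (lowest IGP metric)."""
    # "not is_ebgp" is False for eBGP paths, and False sorts before True.
    return min(paths, key=lambda p: (not p[1], p[2]))[0]


# Section 8: two iBGP paths, next-hop metrics 1010 (R3) and 11 (R6).
print(prefer([("R3", False, 1010), ("R6", False, 11)]))  # R6

# Section 7: iBGP path with next-hop 5.5.5.5 (metric 21) against eBGP path from R4 (metric 11).
print(prefer([("via R3", False, 21), ("R4", True, 11)]))  # R4
```

Note that an eBGP path wins at step 7 even when its IGP metric is worse, because the metric is only consulted when both paths are of the same kind.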

And that’s all for now.
