Ch4-Network Layer

Uploaded by

janabadareen34
Copyright
© © All Rights Reserved

Chapter 4

Network
Layer

Computer Networking: A
Top-Down Approach
8th edition
Jim Kurose, Keith Ross
Pearson, 2020
Network Layer: Internet
host, router network layer functions:

transport layer: TCP, UDP

network layer:
• path-selection algorithms: implemented in
  • routing protocols (OSPF, BGP)
  • SDN controller
• forwarding table
• IP protocol
  • datagram format
  • addressing
  • packet handling conventions
• ICMP protocol
  • error reporting
  • router “signaling”

link layer

physical layer

Network Layer: 4-2


IP Datagram format
Header fields, in order (each row of the header is 32 bits wide):
• ver: IP protocol version number
• head. len: header length (the field itself counts 32-bit words)
• type of service: “type” of service
  • diffserv (bits 0:5)
  • ECN (bits 6:7)
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: used for fragmentation/reassembly
• time to live: TTL, remaining max hops (decremented at each router)
• upper layer: upper layer protocol (e.g., TCP or UDP)
• header checksum
• 32-bit source IP address
• 32-bit destination IP address
• options (if any): e.g., timestamp, record route taken
• payload data (variable length, typically a TCP or UDP segment)

Maximum length: 64K bytes
Typically: 1500 bytes or less

overhead:
• 20 bytes of TCP
• 20 bytes of IP
• = 40 bytes + app layer overhead for TCP+IP
Network Layer: 4-3
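A minimal sketch of how the fixed 20-byte part of the header above could be decoded, assuming no options are present. The function name parse_ipv4_header and the sample values (identifier, TTL 64, protocol 6 for TCP) are illustrative assumptions, not taken from the slides.

```python
import socket
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,                    # high 4 bits
        "header_len_bytes": (ver_ihl & 0x0F) * 4,   # field counts 32-bit words
        "diffserv": tos >> 2,                       # bits 0:5 of type of service
        "ecn": tos & 0x03,                          # bits 6:7 of type of service
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,                  # 3 flag bits
        "fragment_offset": flags_frag & 0x1FFF,     # 13-bit offset
        "ttl": ttl,
        "upper_layer": proto,
        "header_checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hypothetical sample header: 40-byte datagram, "don't fragment" flag set.
sample = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 40, 0x1C46, 0x4000, 64, 6, 0,
    socket.inet_aton("223.1.1.1"), socket.inet_aton("223.1.2.9"),
)
fields = parse_ipv4_header(sample)
```

The checksum is left at 0 in the sample; the sketch only decodes fields and does not verify it.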
IP addressing: introduction
• IP address: 32-bit identifier associated with each host or router interface
• interface: connection between host/router and physical link
  • routers typically have multiple interfaces
  • host typically has one or two interfaces (e.g., wired Ethernet, wireless 802.11)

dotted-decimal IP address notation:
223.1.1.1 = 11011111 00000001 00000001 00000001
               223       1        1        1

[Figure: example network with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]
Network Layer: 4-4
IP addressing: introduction
Q: how are interfaces actually connected?
A: we’ll learn about that in chapters 6, 7
• A: wired Ethernet interfaces connected by Ethernet switches
• A: wireless WiFi interfaces connected by WiFi base station

For now: don’t need to worry about how one interface is connected to another (with no intervening router)

[Figure: the same example network, interfaces 223.1.1.1 to 223.1.3.27]

Network Layer: 4-6


Subnets
• What’s a subnet?
  • device interfaces that can physically reach each other without passing through an intervening router
• IP addresses have structure:
  • subnet part: devices in same subnet have common high order bits
  • host part: remaining low order bits

[Figure: network consisting of 3 subnets; interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]

Network Layer: 4-7


Subnets
Recipe for defining subnets:
• detach each interface from its host or router, creating “islands” of isolated networks
• each isolated network is called a subnet

subnet mask: /24
(high-order 24 bits: subnet part of IP address)

[Figure: the three subnets 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24]

Network Layer: 4-8
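A small sketch of the subnet idea above, using Python's standard ipaddress module: two interfaces are in the same subnet when their high-order 24 bits match. The helper names subnet_of and same_subnet are made up for this illustration.

```python
import ipaddress

# The three /24 subnets from the example network.
SUBNETS = [ipaddress.ip_network(s)
           for s in ("223.1.1.0/24", "223.1.2.0/24", "223.1.3.0/24")]


def subnet_of(addr: str):
    """Return the subnet whose high-order bits match addr, or None."""
    ip = ipaddress.ip_address(addr)
    for net in SUBNETS:
        if ip in net:
            return net
    return None


def same_subnet(a: str, b: str) -> bool:
    """True if both interfaces fall in the same known subnet."""
    net_a, net_b = subnet_of(a), subnet_of(b)
    return net_a is not None and net_a == net_b
```

For example, same_subnet("223.1.1.1", "223.1.1.4") holds, while 223.1.1.1 and 223.1.2.9 are in different subnets and must be reached through a router.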


Subnets
• where are the subnets?
• what are the /24 subnet addresses?

subnets labeled in the figure:
• subnet 223.1.1/24: interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4
• subnet 223.1.2/24: interfaces 223.1.2.1, 223.1.2.2, 223.1.2.6
• subnet 223.1.3/24: interfaces 223.1.3.1, 223.1.3.2, 223.1.3.27
• subnet 223.1.7/24: interfaces 223.1.7.0, 223.1.7.1
• subnet 223.1.8/24: interfaces 223.1.8.0, 223.1.8.1
• subnet 223.1.9/24: interfaces 223.1.9.1, 223.1.9.2

Network Layer: 4-9


IP addressing: CIDR
CIDR: Classless InterDomain Routing (pronounced “cider”)
• subnet portion of address of arbitrary length
• address format: a.b.c.d/x, where x is # bits in subnet portion of address

200.23.16.0/23:
  11001000 00010111 0001000 | 0 00000000
  subnet part (23 bits)     | host part (9 bits)

Network Layer: 4-10
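As a quick check of the a.b.c.d/x format, the sketch below splits 200.23.16.0/23 into its subnet and host bits with the standard ipaddress module. The variable names are illustrative only.

```python
import ipaddress

net = ipaddress.ip_network("200.23.16.0/23")

# 32-bit binary string of the network address.
bits = format(int(net.network_address), "032b")

# x = 23: the first 23 bits are the subnet part, the remaining 9 the host part.
subnet_part = bits[:net.prefixlen]
host_part = bits[net.prefixlen:]
```

With 9 host bits, the block spans 2^9 = 512 addresses, from 200.23.16.0 through 200.23.17.255.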


IP addresses: how to get one?
That’s actually two questions:
1. Q: How does a host get IP address within its network (host part of address)?
2. Q: How does a network get IP address for itself (network part of address)?

How does host get IP address?
• hard-coded by sysadmin in config file (e.g., /etc/rc.config in UNIX)
• DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
  • “plug-and-play”
Network Layer: 4-11
DHCP: Dynamic Host Configuration Protocol
goal: host dynamically obtains IP address from network server when it “joins” network
• can renew its lease on address in use
• allows reuse of addresses (only hold address while connected/on)
• support for mobile users who join/leave network

DHCP overview:
 host broadcasts DHCP discover msg [optional]
 DHCP server responds with DHCP offer msg [optional]
 host requests IP address: DHCP request msg
 DHCP server sends address: DHCP ack msg
Network Layer: 4-12
DHCP client-server scenario
Typically, DHCP server will be co-located in router, serving all subnets to which router is attached

[Figure: example network with DHCP server 223.1.2.5; arriving DHCP client needs address in this network]

Network Layer: 4-13


DHCP client-server scenario
DHCP server: 223.1.2.5; arriving client

DHCP discover (client to server)
Broadcast: is there a DHCP server out there?
  src: 0.0.0.0, 68
  dest: 255.255.255.255, 67
  yiaddr: 0.0.0.0
  transaction ID: 654

DHCP offer (server to client)
Broadcast: I’m a DHCP server! Here’s an IP address you can use
  src: 223.1.2.5, 67
  dest: 255.255.255.255, 68
  yiaddr: 223.1.2.4
  transaction ID: 654
  lifetime: 3600 secs

The two steps above can be skipped “if a client remembers and wishes to reuse a previously allocated network address” [RFC 2131]

DHCP request (client to server)
Broadcast: OK. I would like to use this IP address!
  src: 0.0.0.0, 68
  dest: 255.255.255.255, 67
  yiaddr: 223.1.2.4
  transaction ID: 655
  lifetime: 3600 secs

DHCP ACK (server to client)
Broadcast: OK. You’ve got that IP address!
  src: 223.1.2.5, 67
  dest: 255.255.255.255, 68
  yiaddr: 223.1.2.4
  transaction ID: 655
  lifetime: 3600 secs
Network Layer: 4-14
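The four-message exchange above can be sketched as a toy server object. This is only an illustration of the message flow: the class DhcpServer, its method names, and the dictionary message layout are assumptions, and real DHCP messages carry many more fields.

```python
from dataclasses import dataclass, field


@dataclass
class DhcpServer:
    """Toy DHCP server: hands out addresses from a small pool."""
    server_ip: str
    pool: list                                   # free addresses
    lease_secs: int = 3600
    leases: dict = field(default_factory=dict)   # client name -> address

    def _reply(self, kind, xid, yiaddr):
        # Replies are broadcast from server port 67 to client port 68.
        return {"type": kind, "src": (self.server_ip, 67),
                "dest": ("255.255.255.255", 68), "yiaddr": yiaddr,
                "transaction_id": xid, "lifetime": self.lease_secs}

    def on_discover(self, xid):
        """DHCP discover -> DHCP offer proposing the first free address."""
        return self._reply("offer", xid, self.pool[0]) if self.pool else None

    def on_request(self, xid, yiaddr, client):
        """DHCP request -> DHCP ACK if the requested address is still free."""
        if yiaddr not in self.pool:
            return None
        self.pool.remove(yiaddr)
        self.leases[client] = yiaddr
        return self._reply("ack", xid, yiaddr)


server = DhcpServer("223.1.2.5", ["223.1.2.4"])
offer = server.on_discover(654)
ack = server.on_request(655, offer["yiaddr"], "arriving-client")
```

The request uses a new transaction ID (655), as in the scenario above, so the sketch does not tie the request to the earlier offer.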
DHCP: more than IP addresses
DHCP can return more than just allocated IP address on subnet:
• address of first-hop router for client
• name and IP address of DNS server
• network mask (indicating network versus host portion of address)

Network Layer: 4-15


DHCP: example
• Connecting laptop will use DHCP to get IP address, address of first-hop router, address of DNS server.
• DHCP REQUEST message encapsulated in UDP, encapsulated in IP, encapsulated in Ethernet
• Ethernet frame broadcast (dest: FFFFFFFFFFFF) on LAN, received at router running DHCP server
• Ethernet de-mux’ed to IP de-mux’ed, UDP de-mux’ed to DHCP

[Figure: DHCP message passing down the DHCP/UDP/IP/Eth/Phy stack at the laptop and up the same stack at the router (168.1.1.1) with DHCP server built into router]
Network Layer: 4-16
DHCP: example
• DHCP server formulates DHCP ACK containing client’s IP address, IP address of first-hop router for client, name & IP address of DNS server
• encapsulated DHCP server reply forwarded to client, de-muxing up to DHCP at client
• client now knows its IP address, name and IP address of DNS server, IP address of its first-hop router

[Figure: DHCP reply passing down the DHCP/UDP/IP/Eth/Phy stack at the router with DHCP server built into router and up the same stack at the client]

Network Layer: 4-17


IP addresses: how to get one?
Q: how does network get subnet part of IP address?
A: gets allocated portion of its provider ISP’s address space

ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20

ISP can then allocate out its address space in 8 blocks:

Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
...
Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23

Network Layer: 4-18
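The split of the ISP's /20 block into eight /23 blocks can be reproduced with the standard ipaddress module. A short sketch (variable names are illustrative):

```python
import ipaddress

isp_block = ipaddress.ip_network("200.23.16.0/20")

# Three extra prefix bits (20 -> 23) give 2^3 = 8 equal-sized blocks.
org_blocks = list(isp_block.subnets(new_prefix=23))

for i, block in enumerate(org_blocks):
    print(f"Organization {i}: {block}")
```

Each /23 block holds 512 addresses, so the eight blocks together cover the 4096 addresses of the /20.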


Hierarchical addressing: route aggregation
hierarchical addressing allows efficient advertisement of routing information:

Fly-By-Night-ISP serves:
  Organization 0   200.23.16.0/23
  Organization 1   200.23.18.0/23
  Organization 2   200.23.20.0/23
  ...
  Organization 7   200.23.30.0/23
Fly-By-Night-ISP advertises to the Internet: “Send me anything with addresses beginning 200.23.16.0/20”

ISPs-R-Us advertises to the Internet: “Send me anything with addresses beginning 199.31.0.0/16”

Network Layer: 4-19


Hierarchical addressing: more specific routes
• Organization 1 moves from Fly-By-Night-ISP to ISPs-R-Us
• ISPs-R-Us now advertises a more specific route to Organization 1

Fly-By-Night-ISP serves:
  Organization 0   200.23.16.0/23
  Organization 2   200.23.20.0/23
  ...
  Organization 7   200.23.30.0/23
Fly-By-Night-ISP advertises to the Internet: “Send me anything with addresses beginning 200.23.16.0/20”

ISPs-R-Us serves:
  Organization 1   200.23.18.0/23
ISPs-R-Us advertises to the Internet: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”

Network Layer: 4-20
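Why the more specific advertisement works: a router that has heard both 200.23.16.0/20 and 200.23.18.0/23 forwards on the longest matching prefix. The sketch below assumes that longest-prefix rule; the function name next_isp and the table layout are illustrative.

```python
import ipaddress

# Advertisements heard from the two ISPs after Organization 1 moves.
ADVERTISEMENTS = [
    ("200.23.16.0/20", "Fly-By-Night-ISP"),
    ("199.31.0.0/16", "ISPs-R-Us"),
    ("200.23.18.0/23", "ISPs-R-Us"),   # the more specific route
]
TABLE = [(ipaddress.ip_network(p), isp) for p, isp in ADVERTISEMENTS]


def next_isp(dest: str):
    """Pick the ISP whose advertised prefix is the longest match for dest."""
    ip = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, isp) for net, isp in TABLE if ip in net]
    return max(matches)[1] if matches else None
```

An address in Organization 1's block matches both the /20 and the /23; the /23 is longer, so traffic goes to ISPs-R-Us.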




IP addressing: last words ...
Q: how does an ISP get block of addresses?
A: ICANN: Internet Corporation for Assigned Names and Numbers http://www.icann.org/
• allocates IP addresses, through 5 regional registries (RRs) (who may then allocate to local registries)
• manages DNS root zone, including delegation of individual TLD (.com, .edu, …) management

Q: are there enough 32-bit IP addresses?
• ICANN allocated last chunk of IPv4 addresses to RRs in 2011
• NAT (next) helps IPv4 address space exhaustion
• IPv6 has 128-bit address space

"Who the hell knew how much address space we needed?" Vint Cerf (reflecting on decision to make IPv4 address 32 bits long)
Network Layer: 4-22
NAT: network address translation
 all devices in local network have 32-bit addresses in a “private” IP
address space (10/8, 172.16/12, 192.168/16 prefixes) that can only
be used in local network
 advantages:
 just one IP address needed from provider ISP for all devices
 can change addresses of host in local network without notifying
outside world
 can change ISP without changing addresses of devices in local
network
 security: devices inside local net not directly addressable, visible
by outside world

Network Layer: 4-23


NAT: network address translation
implementation: NAT router must (transparently):
 outgoing datagrams: replace (source IP address, port #) of every
outgoing datagram to (NAT IP address, new port #)
• remote clients/servers will respond using (NAT IP address, new port
#) as destination address
 remember (in NAT translation table) every (source IP address, port #)
to (NAT IP address, new port #) translation pair
 incoming datagrams: replace (NAT IP address, new port #) in
destination fields of every incoming datagram with corresponding
(source IP address, port #) stored in NAT table
Network Layer: 4-24
NAT: network address translation
1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
   S: 10.0.0.1, 3345
   D: 128.119.40.186, 80
2: NAT router changes datagram source address from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
   S: 138.76.29.7, 5001
   D: 128.119.40.186, 80
3: reply arrives, destination address: 138.76.29.7, 5001
   S: 128.119.40.186, 80
   D: 138.76.29.7, 5001
4: datagram forwarded on the LAN side with destination address 10.0.0.1, 3345
   S: 128.119.40.186, 80
   D: 10.0.0.1, 3345

NAT translation table
WAN side addr         LAN side addr
138.76.29.7, 5001     10.0.0.1, 3345
……                    ……

[Figure: local network with addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4; NAT router WAN-side address 138.76.29.7]

Network Layer: 4-25
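The translation-table behavior in the four steps above can be sketched as follows. The class NatRouter and its method names are hypothetical, and picking new port numbers sequentially from 5001 is an assumption made only to reproduce the example.

```python
class NatRouter:
    """Toy NAT: rewrites (IP address, port #) pairs using a translation table."""

    def __init__(self, wan_ip, first_port=5001):
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.lan_to_wan = {}   # (LAN IP, port) -> (NAT IP, new port)
        self.wan_to_lan = {}   # (NAT IP, new port) -> (LAN IP, port)

    def outgoing(self, src, dst):
        """Replace the source of an outgoing datagram, remembering the pair."""
        if src not in self.lan_to_wan:
            wan = (self.wan_ip, self.next_port)
            self.next_port += 1
            self.lan_to_wan[src] = wan
            self.wan_to_lan[wan] = src
        return self.lan_to_wan[src], dst

    def incoming(self, src, dst):
        """Replace the destination of an incoming datagram from the table."""
        lan = self.wan_to_lan.get(dst)
        if lan is None:
            return None        # no table entry: nothing to forward to
        return src, lan


nat = NatRouter("138.76.29.7")
step2 = nat.outgoing(("10.0.0.1", 3345), ("128.119.40.186", 80))
step4 = nat.incoming(("128.119.40.186", 80), ("138.76.29.7", 5001))
```

An incoming datagram for a port with no table entry has no LAN host to go to, which is the NAT traversal problem for servers behind a NAT.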


NAT: network address translation
 NAT has been controversial:
• routers “should” only process up to layer 3
• address “shortage” should be solved by IPv6
• violates end-to-end argument (port # manipulation by network-layer device)
• NAT traversal: what if client wants to connect to server behind NAT?
 but NAT is here to stay:
• extensively used in home and institutional nets, 4G/5G cellular nets

Network Layer: 4-26


IPv6: motivation
 initial motivation: 32-bit IPv4 address space would be
completely allocated
 additional motivation:
• speed processing/forwarding: 40-byte fixed length header
• enable different network-layer treatment of “flows”

Network Layer: 4-27


IPv6 datagram format
Header fields (each row of the header is 32 bits wide):
• ver
• pri (priority): identify priority among datagrams in flow
• flow label: identify datagrams in same “flow” (concept of “flow” not well defined)
• payload len, next hdr, hop limit
• source address (128 bits)
• destination address (128 bits)
• payload (data)

128-bit IPv6 addresses

What’s missing (compared with IPv4):
• no checksum (to speed processing at routers)
• no fragmentation/reassembly
• no options (available as upper-layer, next-header protocol at router)
Network Layer: 4-28
Transition from IPv4 to IPv6
• not all routers can be upgraded simultaneously
• no “flag days”
• how will network operate with mixed IPv4 and IPv6 routers?
 tunneling: IPv6 datagram carried as payload in IPv4 datagram among
IPv4 routers (“packet within a packet”)
• tunneling used extensively in other contexts (4G/5G)

[Figure: an IPv6 datagram (IPv6 header fields with IPv6 source, dest addr, followed by UDP/TCP payload) carried as the IPv4 payload of an IPv4 datagram (IPv4 header fields with IPv4 source, dest addr)]
Network Layer: 4-29
Tunneling and encapsulation
Ethernet connecting two IPv6 routers:
• routers A (IPv6), B (IPv6), E (IPv6), F (IPv6); Ethernet connects two IPv6 routers
• the usual: datagram as payload in link-layer frame (IPv6 datagram inside link-layer frame)

IPv4 network connecting two IPv6 routers:
• routers A (IPv6), B (IPv6/v4), E (IPv6/v4), F (IPv6); IPv4 tunnel connecting IPv6 routers
• tunneling: IPv6 datagram as payload in an IPv4 datagram (IPv6 datagram inside IPv4 datagram)
Network Layer: 4-31
Tunneling
logical view: A (IPv6), B (IPv6/v4), IPv4 tunnel connecting IPv6 routers, E (IPv6/v4), F (IPv6)
physical view: A (IPv6), B (IPv6/v4), C (IPv4), D (IPv4), E (IPv6/v4), F (IPv6)

A-to-B: IPv6
  flow: X, src: A, dest: F, data
B-to-C, C-to-D, D-to-E: IPv6 inside IPv4
  IPv4 header: src: B, dest: E
  IPv4 payload: the IPv6 datagram (Flow: X, Src: A, Dest: F, data)
E-to-F: IPv6
  flow: X, src: A, dest: F, data

Note source and destination addresses!
Network Layer: 4-32
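The “packet within a packet” idea can be sketched with two small record types. The class and function names below are illustrative assumptions; the addresses A, B, E, F and flow X follow the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPv6Datagram:
    flow: str
    src: str
    dst: str
    data: bytes


@dataclass(frozen=True)
class IPv4Datagram:
    src: str
    dst: str
    payload: IPv6Datagram   # the whole IPv6 datagram rides as payload


def enter_tunnel(inner: IPv6Datagram, entry: str, exit_: str) -> IPv4Datagram:
    """At the tunnel entry (router B): wrap the IPv6 datagram in IPv4."""
    return IPv4Datagram(src=entry, dst=exit_, payload=inner)


def leave_tunnel(outer: IPv4Datagram) -> IPv6Datagram:
    """At the tunnel exit (router E): unwrap and recover the IPv6 datagram."""
    return outer.payload


original = IPv6Datagram(flow="X", src="A", dst="F", data=b"data")
in_tunnel = enter_tunnel(original, entry="B", exit_="E")
recovered = leave_tunnel(in_tunnel)
```

The IPv4 routers C and D only look at the outer header (src B, dest E); the inner header (src A, dest F) is untouched.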
IPv6: adoption
• Google [1]: ~ 40% of clients access services via IPv6 (2023)
• NIST: 1/3 of all US government domains are IPv6 capable
• Long (long!) time for deployment, use
  • 25 years and counting!
  • think of application-level changes in last 25 years: WWW, social media, streaming media, gaming, telepresence, …
  • Why?

[1] https://www.google.com/intl/en/ipv6/statistics.html
Network Layer: 4-34
Network-layer functions
• forwarding: move packets from router’s input to appropriate router output (data plane)
• routing: determine route taken by packets from source to destination (control plane)

Two approaches to structuring network control plane:
• per-router control (traditional)
• logically centralized control (software defined networking)

Network Layer: 5-35


Per-router control plane
Individual routing algorithm components in each and every router interact in the control plane

[Figure: a routing algorithm in each router (control plane) above the router’s forwarding (data plane); values in arriving packet header (0111) select among output links 1, 2, 3]

Network Layer: 4-36


Software-Defined Networking (SDN) control plane
Remote controller computes, installs forwarding tables in routers

[Figure: a remote controller (control plane) communicating with a CA in each router; in the data plane, values in arriving packet header (0111) select among output links 1, 2, 3]

Network Layer: 4-37


Per-router control plane    SDN control plane

Network layer: “control plane” roadmap
• introduction
• routing protocols
  • link state
  • distance vector
• intra-ISP routing: OSPF
• routing among ISPs: BGP
• SDN control plane
• Internet Control Message Protocol
• network management, configuration
  • SNMP
  • NETCONF/YANG

Network Layer: 4-39


Routing protocols
Routing protocol goal: determine “good” paths (equivalently, routes), from sending hosts to receiving host, through network of routers
• path: sequence of routers packets traverse from given initial source host to final destination host
• “good”: least “cost”, “fastest”, “least congested”
• routing: a “top-10” networking challenge!

[Figure: hosts (application, transport, network, link, physical) and routers (network, link, physical) across a mobile network, national or global ISP, datacenter network, and enterprise network]

Network Layer: 5-40


Graph abstraction: link costs
graph: G = (N,E)
N: set of routers = { u, v, w, x, y, z }
E: set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }

c_{a,b}: cost of direct link connecting a and b
e.g., c_{w,z} = 5, c_{u,z} = ∞

cost defined by network operator: could always be 1, or inversely related to bandwidth, or inversely related to congestion

[Figure: graph with nodes u, v, w, x, y, z and a cost on each link]

Network Layer: 5-41


Routing algorithm classification
global or decentralized information?
• global: all routers have complete topology, link cost info
  • “link state” algorithms
• decentralized: iterative process of computation, exchange of info with neighbors
  • routers initially only know link costs to attached neighbors
  • “distance vector” algorithms

How fast do routes change?
• static: routes change slowly over time
• dynamic: routes change more quickly
  • periodic updates or in response to link cost changes
Network Layer: 5-42


Dijkstra’s link-state routing algorithm
• centralized: network topology, link costs known to all nodes
  • accomplished via “link state broadcast”
  • all nodes have same info
• computes least cost paths from one node (“source”) to all other nodes
  • gives forwarding table for that node
• iterative: after k iterations, know least cost path to k destinations

notation
• c_{x,y}: direct link cost from node x to y; = ∞ if not direct neighbors
• D(v): current estimate of cost of least-cost-path from source to destination v
• p(v): predecessor node along path from source to v
• N': set of nodes whose least-cost-path definitively known

Network Layer: 5-44


Dijkstra’s link-state routing algorithm
1  Initialization:
2    N' = {u}                    /* compute least cost path from u to all other nodes */
3    for all nodes v
4      if v adjacent to u        /* u initially knows direct-path-cost only to direct neighbors */
5        then D(v) = c_{u,v}     /* but may not be minimum cost! */
6      else D(v) = ∞
7
8  Loop
9    find w not in N' such that D(w) is a minimum
10   add w to N'
11   update D(v) for all v adjacent to w and not in N':
12       D(v) = min ( D(v), D(w) + c_{w,v} )
13   /* new least-path-cost to v is either old least-cost-path to v or known
14      least-cost-path to w plus direct-cost from w to v */
15 until all nodes in N'
Network Layer: 5-45
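A runnable sketch of the algorithm above. It swaps the linear scan of line 9 for a binary heap (Python's heapq), which is one of the "more efficient implementations" rather than the slide's exact procedure. The link costs are those of the six-node example graph used in this chapter; the names GRAPH and dijkstra are illustrative.

```python
import heapq
import math

# Link costs of the example graph (undirected, so each link is listed twice).
GRAPH = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}


def dijkstra(graph, source):
    """Return (D, p): least path costs and predecessor nodes from source."""
    D = {node: math.inf for node in graph}
    p = {node: None for node in graph}
    D[source] = 0
    n_prime = set()                 # N': nodes whose least cost is final
    heap = [(0, source)]
    while heap:
        cost, w = heapq.heappop(heap)
        if w in n_prime:
            continue                # stale heap entry, already finalized
        n_prime.add(w)
        for v, c_wv in graph[w].items():
            if v not in n_prime and cost + c_wv < D[v]:
                D[v] = cost + c_wv  # D(v) = min(D(v), D(w) + c_{w,v})
                p[v] = w
                heapq.heappush(heap, (D[v], v))
    return D, p


D, p = dijkstra(GRAPH, "u")
```

Tracing p back from any destination gives its least-cost path from u, from which the forwarding table entry (the first link on that path) follows.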
Dijkstra’s algorithm: an example
Link costs in the example graph: c_{u,v} = 2, c_{u,x} = 1, c_{u,w} = 5, c_{v,x} = 2, c_{v,w} = 3, c_{x,w} = 3, c_{x,y} = 1, c_{w,y} = 1, c_{w,z} = 5, c_{y,z} = 2

Step  N'       D(v),p(v)  D(w),p(w)  D(x),p(x)  D(y),p(y)  D(z),p(z)
0     u        2,u        5,u        1,u        ∞          ∞
1     ux       2,u        4,x                   2,x        ∞
2     uxy      2,u        3,y                              4,y
3     uxyv                3,y                              4,y
4     uxyvw                                                4,y
5     uxyvwz

Initialization (step 0):
  for all a: if a adjacent to u then D(a) = c_{u,a}

Each later step runs lines 9 to 12 of the algorithm:
  9  find a not in N' such that D(a) is a minimum
  10 add a to N'
  11 update D(b) for all b adjacent to a and not in N':
  12     D(b) = min ( D(b), D(a) + c_{a,b} )

Step 1: add x to N'
  D(v) = min ( D(v), D(x) + c_{x,v} ) = min(2, 1+2) = 2
  D(w) = min ( D(w), D(x) + c_{x,w} ) = min(5, 1+3) = 4
  D(y) = min ( D(y), D(x) + c_{x,y} ) = min(∞, 1+1) = 2
Step 2: add y to N'
  D(w) = min ( D(w), D(y) + c_{y,w} ) = min(4, 2+1) = 3
  D(z) = min ( D(z), D(y) + c_{y,z} ) = min(∞, 2+2) = 4
Step 3: add v to N'
  D(w) = min ( D(w), D(v) + c_{v,w} ) = min(3, 2+3) = 3
Step 4: add w to N'
  D(z) = min ( D(z), D(w) + c_{w,z} ) = min(4, 3+5) = 4
Step 5: add z to N'; all nodes are now in N'
Dijkstra’s algorithm: an example
resulting least-cost-path tree from u:
[Figure: tree rooted at u spanning v, x, y, w, z]

resulting forwarding table in u:
destination   outgoing link
v             (u,v)     route from u to v directly
x             (u,x)     route from u to all other destinations via x
y             (u,x)
w             (u,x)
z             (u,x)
Network Layer: 5-58
Dijkstra’s algorithm: another example
Step  N'       D(v),p(v)  D(w),p(w)  D(x),p(x)  D(y),p(y)  D(z),p(z)
0     u        7,u        3,u        5,u        ∞          ∞
1     uw       6,w                   5,u        11,w       ∞
2     uwx      6,w                              11,w       14,x
3     uwxv                                      10,v       14,x
4     uwxvy                                                12,y
5     uwxvyz

[Figure: example graph with nodes u, v, w, x, y, z and a cost on each link]

notes:
• construct least-cost-path tree by tracing predecessor nodes
• ties can exist (can be broken arbitrarily)
Network Layer: 5-59
Dijkstra’s algorithm: discussion
algorithm complexity: n nodes
• each of n iterations: need to check all nodes, w, not in N'
• n(n+1)/2 comparisons: O(n^2) complexity
• more efficient implementations possible: O(n log n)
message complexity:
• each router must broadcast its link state information to other n routers
• efficient (and interesting!) broadcast algorithms: O(n) link crossings to disseminate a broadcast message from one source
• each router’s message crosses O(n) links: overall message complexity: O(n^2)

Network Layer: 5-60


Dijkstra’s algorithm: oscillations possible
• when link costs depend on traffic volume, route oscillations possible
• sample scenario:
  • routing to destination a, traffic entering at d, c, e with rates 1, e (<1), 1
  • link costs are directional, and volume-dependent

[Figure: four snapshots of the network of nodes a, b, c, d with volume-dependent link costs (values 0, 1, 1+e, 2+e). Panel 1: initially. Panels 2 to 4: given these costs, find new routing…. resulting in new costs]

Network Layer: 5-61




Distance vector algorithm
Based on Bellman-Ford (BF) equation (dynamic programming):

Bellman-Ford equation
Let D_x(y): cost of least-cost path from x to y.
Then:
  D_x(y) = min_v { c_{x,v} + D_v(y) }
where
• min taken over all neighbors v of x
• c_{x,v}: direct cost of link from x to v
• D_v(y): v’s estimated least-cost-path cost to y
Network Layer: 5-63
Bellman-Ford Example
Suppose that u’s neighboring nodes, x, v, w, know that for destination z:
  D_v(z) = 5, D_x(z) = 3, D_w(z) = 3

Bellman-Ford equation says:
  D_u(z) = min { c_{u,v} + D_v(z),
                 c_{u,x} + D_x(z),
                 c_{u,w} + D_w(z) }
         = min { 2 + 5,
                 1 + 3,
                 5 + 3 } = 4

node achieving minimum (x) is next hop on estimated least-cost path to destination (z)
Network Layer: 5-64
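The Bellman-Ford computation at node u can be written out directly. A minimal sketch, where the function name bellman_ford_update and the dictionary layout are assumptions; the numbers are the ones from the example above.

```python
import math


def bellman_ford_update(link_costs, neighbor_dvs, dest):
    """Apply D_x(y) = min_v { c_{x,v} + D_v(y) } for one destination.

    link_costs:   {neighbor v: c_{x,v}}
    neighbor_dvs: {neighbor v: {destination y: D_v(y)}}
    Returns (least cost, neighbor achieving the minimum).
    """
    best_cost, best_hop = math.inf, None
    for v, c_xv in link_costs.items():
        candidate = c_xv + neighbor_dvs[v].get(dest, math.inf)
        if candidate < best_cost:
            best_cost, best_hop = candidate, v
    return best_cost, best_hop


# Node u: link costs to its neighbors and what each neighbor reports for z.
link_costs_u = {"v": 2, "x": 1, "w": 5}
neighbor_dvs = {"v": {"z": 5}, "x": {"z": 3}, "w": {"z": 3}}

cost, next_hop = bellman_ford_update(link_costs_u, neighbor_dvs, "z")
```

The returned neighbor is the next hop u would install in its forwarding table for destination z.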
Distance vector algorithm
key idea:
• from time-to-time, each node sends its own distance vector estimate to neighbors
• when x receives new DV estimate from any neighbor, it updates its own DV using B-F equation:
  D_x(y) ← min_v { c_{x,v} + D_v(y) } for each node y ∊ N
• under minor, natural conditions, the estimate D_x(y) converges to the actual least cost d_x(y)

Network Layer: 5-65


Distance vector algorithm:
each node repeats:
• wait for (change in local link cost or msg from neighbor)
• recompute DV estimates using DV received from neighbor
• if DV to any destination has changed, notify neighbors

iterative, asynchronous: each local iteration caused by:
• local link cost change
• DV update message from neighbor

distributed, self-stopping: each node notifies neighbors only when its DV changes
• neighbors then notify their neighbors, only if necessary
• no notification received, no actions taken!

Network Layer: 5-66
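A sketch of the per-node loop above, simplified to synchronous rounds in which every node recomputes at once (the real algorithm is asynchronous). The topology is the nine-node grid used in the distance vector example of this chapter, with link costs as read from its figure; treat the LINKS table and all function names as assumptions of this sketch.

```python
import math

# Link costs read from the example figure: a-b costs 8, every other link 1.
LINKS = {("a", "b"): 8, ("b", "c"): 1, ("a", "d"): 1, ("b", "e"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("d", "g"): 1, ("e", "h"): 1,
         ("f", "i"): 1, ("g", "h"): 1, ("h", "i"): 1}
NODES = sorted({n for link in LINKS for n in link})
COST = {n: {} for n in NODES}
for (x, y), c in LINKS.items():
    COST[x][y] = c
    COST[y][x] = c


def initial_dv(x):
    """t=0: a node only knows the cost to itself and to direct neighbors."""
    return {y: 0 if y == x else COST[x].get(y, math.inf) for y in NODES}


def recompute(x, dvs):
    """B-F update at x from the DVs last received from its neighbors."""
    return {y: 0 if y == x else
            min(COST[x][v] + dvs[v][y] for v in COST[x]) for y in NODES}


def one_round(dvs):
    """Every node receives neighbors' DVs and computes its new DV."""
    return {x: recompute(x, dvs) for x in NODES}


def converge(dvs):
    """Repeat rounds until no DV changes (self-stopping)."""
    while True:
        new = one_round(dvs)
        if new == dvs:
            return dvs
        dvs = new


dv_t0 = {x: initial_dv(x) for x in NODES}
dv_t1 = one_round(dv_t0)
final = converge(dv_t0)
```

After one round, b already knows a 2-hop path to d, f and h via e; after convergence b reaches a at cost 3 through e and d instead of using the direct cost-8 link.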


Distance vector: example
t=0
• All nodes have distance estimates to nearest neighbors (only)
• All nodes send their local distance vector to their neighbors

DV in a:
D_a(a) = 0, D_a(b) = 8, D_a(c) = ∞, D_a(d) = 1, D_a(e) = ∞, D_a(f) = ∞, D_a(g) = ∞, D_a(h) = ∞, D_a(i) = ∞

A few asymmetries:
• missing link
• larger cost

[Figure: nine nodes arranged in a 3×3 grid (rows a b c, d e f, g h i); link a-b has cost 8, the other links shown have cost 1]

Network Layer: 5-67


Distance vector example: iteration
At t=1, and again at t=2, all nodes:
• receive distance vectors from neighbors
• compute their new local distance vector
• send their new local distance vector to neighbors

[Figure: the 3×3 grid of nodes a to i, shown once per phase (receive, compute, send) for t=1 and for t=2]

Network Layer: 5-73


Distance vector example: iteration

…. and so on

Let’s next take a look at the iterative computations at nodes

Network Layer: 5-74


Distance vector example: computation
t=1
• b receives DVs from a, c, e, computes:

DVs at t=0:
dest   DV in a   DV in b   DV in c   DV in e
a      0         8         ∞         ∞
b      8         0         1         1
c      ∞         1         0         ∞
d      1         ∞         ∞         1
e      ∞         1         ∞         0
f      ∞         ∞         ∞         1
g      ∞         ∞         ∞         ∞
h      ∞         ∞         ∞         1
i      ∞         ∞         ∞         ∞

D_b(a) = min{c_{b,a}+D_a(a), c_{b,c}+D_c(a), c_{b,e}+D_e(a)} = min{8, ∞, ∞} = 8
D_b(c) = min{c_{b,a}+D_a(c), c_{b,c}+D_c(c), c_{b,e}+D_e(c)} = min{∞, 1, ∞} = 1
D_b(d) = min{c_{b,a}+D_a(d), c_{b,c}+D_c(d), c_{b,e}+D_e(d)} = min{9, ∞, 2} = 2
D_b(e) = min{c_{b,a}+D_a(e), c_{b,c}+D_c(e), c_{b,e}+D_e(e)} = min{∞, ∞, 1} = 1
D_b(f) = min{c_{b,a}+D_a(f), c_{b,c}+D_c(f), c_{b,e}+D_e(f)} = min{∞, ∞, 2} = 2
D_b(g) = min{c_{b,a}+D_a(g), c_{b,c}+D_c(g), c_{b,e}+D_e(g)} = min{∞, ∞, ∞} = ∞
D_b(h) = min{c_{b,a}+D_a(h), c_{b,c}+D_c(h), c_{b,e}+D_e(h)} = min{∞, ∞, 2} = 2
D_b(i) = min{c_{b,a}+D_a(i), c_{b,c}+D_c(i), c_{b,e}+D_e(i)} = min{∞, ∞, ∞} = ∞

new DV in b:
D_b(a) = 8, D_b(c) = 1, D_b(d) = 2, D_b(e) = 1, D_b(f) = 2, D_b(g) = ∞, D_b(h) = 2, D_b(i) = ∞
Network Layer: 5-76
Distance vector example: computation

t=1
 c receives DVs from b

DV in a: Da(a) = 0, Da(b) = 8, Da(c) = ∞, Da(d) = 1, Da(e) = ∞, Da(f) = ∞, Da(g) = ∞, Da(h) = ∞, Da(i) = ∞
DV in b: Db(a) = 8, Db(c) = 1, Db(d) = ∞, Db(e) = 1, Db(f) = ∞, Db(g) = ∞, Db(h) = ∞, Db(i) = ∞
DV in c: Dc(a) = ∞, Dc(b) = 1, Dc(c) = 0, Dc(d) = ∞, Dc(e) = ∞, Dc(f) = ∞, Dc(g) = ∞, Dc(h) = ∞, Dc(i) = ∞
DV in e: De(a) = ∞, De(b) = 1, De(c) = ∞, De(d) = 1, De(e) = 0, De(f) = 1, De(g) = ∞, De(h) = 1, De(i) = ∞

Network Layer: 5-77


Distance vector example: computation

t=1
 c receives DVs from b, computes:

DV in b: Db(a) = 8, Db(c) = 1, Db(d) = ∞, Db(e) = 1, Db(f) = ∞, Db(g) = ∞, Db(h) = ∞, Db(i) = ∞
DV in c (old): Dc(a) = ∞, Dc(b) = 1, Dc(c) = 0, Dc(d) = ∞, Dc(e) = ∞, Dc(f) = ∞, Dc(g) = ∞, Dc(h) = ∞, Dc(i) = ∞

Dc(a) = min{cc,b+Db(a)} = 1 + 8 = 9
Dc(b) = min{cc,b+Db(b)} = 1 + 0 = 1
Dc(d) = min{cc,b+Db(d)} = 1 + ∞ = ∞
Dc(e) = min{cc,b+Db(e)} = 1 + 1 = 2
Dc(f) = min{cc,b+Db(f)} = 1 + ∞ = ∞
Dc(g) = min{cc,b+Db(g)} = 1 + ∞ = ∞
Dc(h) = min{cc,b+Db(h)} = 1 + ∞ = ∞
Dc(i) = min{cc,b+Db(i)} = 1 + ∞ = ∞

DV in c (new): Dc(a) = 9, Dc(b) = 1, Dc(c) = 0, Dc(d) = ∞, Dc(e) = 2, Dc(f) = ∞, Dc(g) = ∞, Dc(h) = ∞, Dc(i) = ∞

* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
Network Layer: 5-78
Distance vector example: computation

t=1
 e receives DVs from b, d, f, h

DV in b: Db(a) = 8, Db(c) = 1, Db(d) = ∞, Db(e) = 1, Db(f) = ∞, Db(g) = ∞, Db(h) = ∞, Db(i) = ∞
DV in d: Dd(a) = 1, Dd(b) = ∞, Dd(c) = ∞, Dd(d) = 0, Dd(e) = 1, Dd(f) = ∞, Dd(g) = 1, Dd(h) = ∞, Dd(i) = ∞
DV in e: De(a) = ∞, De(b) = 1, De(c) = ∞, De(d) = 1, De(e) = 0, De(f) = 1, De(g) = ∞, De(h) = 1, De(i) = ∞
DV in f: Df(a) = ∞, Df(b) = ∞, Df(c) = ∞, Df(d) = ∞, Df(e) = 1, Df(f) = 0, Df(g) = ∞, Df(h) = ∞, Df(i) = 1
DV in h: Dh(a) = ∞, Dh(b) = ∞, Dh(c) = ∞, Dh(d) = ∞, Dh(e) = 1, Dh(f) = ∞, Dh(g) = 1, Dh(h) = 0, Dh(i) = 1

Q: what is new DV computed in e at t=1?
Network Layer: 5-79
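One way to check an answer to the question above is to apply the same min-over-neighbors rule at e. The sketch below is only an illustration: the helper `dv` and the variable names are hypothetical, the received vectors are those of b, d, f, and h at t=0, and every link from e is assumed to cost 1 as in the figure.

```python
import math

INF = math.inf
NODES = "abcdefghi"


def dv(**known):
    """Build a distance vector; destinations not listed are unreachable."""
    return {n: known.get(n, INF) for n in NODES}


# DVs that e receives from its neighbors (their state at t=0).
received = {
    "b": dv(a=8, b=0, c=1, e=1),
    "d": dv(a=1, d=0, e=1, g=1),
    "f": dv(e=1, f=0, i=1),
    "h": dv(e=1, g=1, h=0, i=1),
}
# Cost of the direct link from e to each neighbor (assumed from the figure).
link_cost = {"b": 1, "d": 1, "f": 1, "h": 1}

# De(y) = min over neighbors v of c(e,v) + Dv(y)
new_dv_e = {
    y: 0 if y == "e" else min(link_cost[v] + received[v][y] for v in received)
    for y in NODES
}
```

Under these assumptions e now reaches a, c, g, and i at cost 2, each through one of its four neighbors.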
Distance vector: state information diffusion
Iterative communication and computation steps diffuse information through the network:
t=0: c’s state at t=0 is at c only
t=1: c’s state at t=0 has propagated to b, and may influence distance vector computations up to 1 hop away, i.e., at b
t=2: c’s state at t=0 may now influence distance vector computations up to 2 hops away, i.e., at b and now at a, e as well
t=3: c’s state at t=0 may influence distance vector computations up to 3 hops away, i.e., at d, f, h
t=4: c’s state at t=0 may influence distance vector computations up to 4 hops away, i.e., at g, i
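The round at which c's state first reaches each node equals that node's hop distance from c, which a breadth-first search can compute. This is a small sketch under the assumption that the adjacency list below matches the example figure; `ADJ`, `hops_from`, and `reached_at` are hypothetical names.

```python
from collections import deque

# Adjacency of the example grid (assumed from the figure; costs are irrelevant here).
ADJ = {
    "a": ["b", "d"], "b": ["a", "c", "e"], "c": ["b"],
    "d": ["a", "e", "g"], "e": ["b", "d", "f", "h"], "f": ["e", "i"],
    "g": ["d", "h"], "h": ["e", "g", "i"], "i": ["f", "h"],
}


def hops_from(source):
    """Breadth-first search: hop count from `source` to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in ADJ[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def reached_at(source):
    """Group nodes by the first round t at which `source`'s t=0 state can reach them."""
    by_round = {}
    for node, d in hops_from(source).items():
        by_round.setdefault(d, []).append(node)
    return {t: sorted(nodes) for t, nodes in sorted(by_round.items())}
```

Calling `reached_at("c")` groups the nodes the same way as the t=0 through t=4 steps listed above.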
Distance vector: link cost changes
link cost changes:
 node detects local link cost change
 updates routing info, recalculates local DV
 if DV changes, notify neighbors

[Figure: x-y link cost changes from 4 to 1; y-z costs 1; x-z costs 50]

“good news travels fast”
t0: y detects link-cost change, updates its DV, informs its neighbors.
t1: z receives update from y, updates its DV, computes new least cost to x, sends its neighbors its DV.
t2: y receives z’s update, updates its DV. y’s least costs do not change, so y does not send a message to z.
Distance vector: link cost changes
link cost changes:
 node detects local link cost change
 “bad news travels slow” – count-to-infinity problem:
• y sees direct link to x has new cost 60, but z has said it has a path at cost of 5. So y computes “my new cost to x will be 6, via z”; notifies z of new cost of 6 to x.
• z learns that path to x via y has new cost 6, so z computes “my new cost to x will be 7, via y”; notifies y of new cost of 7 to x.
• y learns that path to x via z has new cost 7, so y computes “my new cost to x will be 8, via z”; notifies z of new cost of 8 to x.
• z learns that path to x via y has new cost 8, so z computes “my new cost to x will be 9, via y”; notifies y of new cost of 9 to x.

[Figure: x-y link cost changes from 4 to 60; y-z costs 1; x-z costs 50]

 see text for solutions. Distributed algorithms are tricky!
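The two link-cost scenarios can be replayed with a short simulation. This is a sketch under simplifying assumptions: y and z strictly alternate their recomputations, only their estimates of the cost to x are tracked, and z starts with its old estimate of 5 (via y). The function name and parameters are hypothetical.

```python
def count_to_infinity(cost_yx=60, cost_yz=1, cost_zx=50, z_to_x=5):
    """Alternate y's and z's recomputations until neither estimate changes.

    Returns the list of (y's cost to x, z's cost to x) after each exchange.
    """
    history = []
    y_to_x = None
    while True:
        new_y = min(cost_yx, cost_yz + z_to_x)  # y: direct link vs. via z
        new_z = min(cost_zx, cost_yz + new_y)   # z: direct link vs. via y
        if (new_y, new_z) == (y_to_x, z_to_x):
            return history
        y_to_x, z_to_x = new_y, new_z
        history.append((y_to_x, z_to_x))


bad_news = count_to_infinity(cost_yx=60)   # x-y cost rises from 4 to 60
good_news = count_to_infinity(cost_yx=1)   # x-y cost drops from 4 to 1
```

With the cost rising to 60, the estimates climb 6, 7, 8, 9, ... one step per message until z falls back to its direct link of cost 50 and y settles on 51. With the cost dropping to 1, a single exchange is enough.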
Comparison of LS and DV algorithms
message complexity
LS: n routers, O(n²) messages sent
DV: exchange between neighbors; convergence time varies

speed of convergence
LS: O(n²) algorithm, O(n²) messages
• may have oscillations
DV: convergence time varies
• may have routing loops
• count-to-infinity problem

robustness: what happens if router malfunctions, or is compromised?
LS:
• router can advertise incorrect link cost
• each router computes only its own table
DV:
• DV router can advertise incorrect path cost (“I have a really low-cost path to everywhere”): black-holing
• each router’s DV is used by others: errors propagate through network
Network layer: “control plane” roadmap
 introduction
 routing protocols
 intra-ISP routing: OSPF
 routing among ISPs: BGP
 SDN control plane
 Internet Control Message Protocol
 network management, configuration
• SNMP
• NETCONF/YANG

Network Layer: 5-84


Making routing scalable
our routing study thus far - idealized
• all routers identical
• network “flat”
… not true in practice
scale: billions of destinations:
 can’t store all destinations in routing tables!
 routing table exchange would swamp links!
administrative autonomy:
 Internet: a network of networks
 each network admin may want to control routing in its own network

Network Layer: 5-85


Internet approach to scalable routing
aggregate routers into regions known as “autonomous systems” (AS) (a.k.a. “domains”)

intra-AS (aka “intra-domain”): routing among routers within same AS (“network”)
 all routers in AS must run same intra-domain protocol
 routers in different AS can run different intra-domain routing protocols
 gateway router: at “edge” of its own AS, has link(s) to router(s) in other AS’es

inter-AS (aka “inter-domain”): routing among AS’es
 gateways perform inter-domain routing (as well as intra-domain routing)
Network Layer: 5-86
Interconnected ASes
forwarding table configured by intra- and inter-AS routing algorithms
 intra-AS routing determines entries for destinations within AS
 inter-AS & intra-AS routing determine entries for external destinations

[Figure: intra-AS and inter-AS routing algorithms feed a router's forwarding table; AS1 (routers 1a-1d), AS2 (2a-2c) and AS3 (3a-3c) run intra-AS routing internally and inter-AS routing between them]

Network Layer: 5-87


Inter-AS routing: a role in intradomain forwarding
 suppose router in AS1 receives datagram destined outside of AS1:
• router should forward packet to gateway router in AS1, but which one?
AS1 inter-domain routing must:
1. learn which destinations reachable through AS2, which through AS3
2. propagate this reachability info to all routers in AS1

[Figure: AS1 (routers 1a-1d) connected to AS2 (2a-2c) and AS3 (3a-3c), each of which connects to other networks]

Network Layer: 5-88


Intra-AS routing: routing within an AS
most common intra-AS routing protocols:
 RIP: Routing Information Protocol [RFC 1723]
• classic DV: DVs exchanged every 30 secs
• no longer widely used
 EIGRP: Enhanced Interior Gateway Routing Protocol
• DV based
• formerly Cisco-proprietary for decades (became open in 2013 [RFC 7868])
 OSPF: Open Shortest Path First [RFC 2328]
• link-state routing
• IS-IS protocol (ISO standard, not RFC standard) essentially same as OSPF

Network Layer: 5-89


OSPF (Open Shortest Path First) routing
 “open”: publicly available
 classic link-state
• each router floods OSPF link-state advertisements (directly over IP
rather than using TCP/UDP) to all other routers in entire AS
• multiple link cost metrics possible: bandwidth, delay
• each router has full topology, uses Dijkstra’s algorithm to compute
forwarding table
 security: all OSPF messages authenticated (to prevent malicious
intrusion)

Network Layer: 5-90
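Once a router holds the full topology, computing its forwarding table amounts to running Dijkstra's algorithm from itself and remembering the first hop of each shortest path. The sketch below assumes a small hypothetical four-router topology (`r1` to `r4`, with made-up link costs); it is not OSPF code, only the shortest-path step.

```python
import heapq

INF = float("inf")


def dijkstra_first_hops(graph, source):
    """Return (cost, first_hop) maps for all destinations reachable from `source`."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry: a cheaper path to u was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                # Direct neighbors are their own first hop; others inherit u's.
                first_hop[v] = v if u == source else first_hop[u]
                heapq.heappush(heap, (nd, v))
    return dist, first_hop


# Hypothetical topology: the direct r1-r2 link is expensive.
topology = {
    "r1": {"r2": 8, "r4": 1},
    "r2": {"r1": 8, "r3": 1, "r4": 2},
    "r3": {"r2": 1},
    "r4": {"r1": 1, "r2": 2},
}
cost, forwarding = dijkstra_first_hops(topology, "r1")
```

In this example r1 forwards traffic for r2 and r3 to r4, since r1-r4-r2 costs 3 against 8 for the direct link.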


Hierarchical OSPF
 two-level hierarchy: local area, backbone.
• link-state advertisements flooded only in area, or backbone
• each node has detailed area topology; only knows direction to reach other destinations
area border routers: “summarize” distances to destinations in own area, advertise in backbone
boundary router: connects to other ASes
backbone router: runs OSPF limited to backbone
local routers:
• flood LS in area only
• compute routing within area
• forward packets to outside via area border router

[Figure: backbone connecting area 1, area 2 and area 3, each containing internal routers]
Network Layer: 5-91


Interconnected ASes
[Figure: AS1 (routers 1a-1d), AS2 (2a-2c), AS3 (3a-3c); intra-AS routing inside each AS, inter-AS routing between them]

intra-AS (aka “intra-domain”): routing among routers within same AS (“network”)
inter-AS (aka “inter-domain”): routing among AS’es
Network Layer: 5-93
Internet inter-AS routing: BGP
 BGP (Border Gateway Protocol): the de facto inter-domain routing
protocol
• “glue that holds the Internet together”
 allows subnet to advertise its existence, and the destinations it can
reach, to rest of Internet: “I am here, here is who I can reach, and how”
 BGP provides each AS a means to:
• obtain destination network reachability info from neighboring ASes
(eBGP)
• determine routes to other networks based on reachability information
and policy
• propagate reachability information to all AS-internal routers (iBGP)
• advertise (to neighboring networks) destination reachability info
Network Layer: 5-94
eBGP, iBGP connections
[Figure: AS 1 (routers 1a-1d), AS 2 (2a-2d), AS 3 (3a-3d); eBGP connectivity between gateway routers of neighboring ASes, logical iBGP connectivity among routers within each AS]

gateway routers run both eBGP and iBGP protocols

Network Layer: 5-95


BGP basics
 BGP session: two BGP routers (“peers”) exchange BGP messages over
semi-permanent TCP connection:
• advertising paths to different destination network prefixes (BGP is a “path
vector” protocol)
 when AS3 gateway 3a advertises path AS3,X to AS2 gateway 2c:
• AS3 promises to AS2 it will forward datagrams towards X
[Figure: AS 1 (routers 1a-1d), AS 2 (2a-2d), AS 3 (3a-3d, with destination X); BGP advertisement "AS3, X" sent from 3a to 2c]
Network Layer: 5-96
BGP protocol messages
 BGP messages exchanged between peers over TCP connection
 BGP messages [RFC 4271]:
• OPEN: opens TCP connection to remote BGP peer and authenticates
sending BGP peer
• UPDATE: advertises new path (or withdraws old)
• KEEPALIVE: keeps connection alive in absence of UPDATES; also ACKs
OPEN request
• NOTIFICATION: reports errors in previous msg; also used to close
connection
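Each of the four message types travels behind the same fixed 19-byte BGP header: a 16-byte marker of all ones, a 2-byte total length, and a 1-byte type code. The sketch below packs and parses only that header; the function names are hypothetical, and message bodies (for example UPDATE path attributes) are not handled.

```python
import struct

# Network byte order: 16-byte marker, 2-byte length, 1-byte type (19 bytes total).
BGP_HEADER = struct.Struct("!16sHB")
MARKER = b"\xff" * 16
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def build_keepalive():
    """A KEEPALIVE consists of the header alone, so its length field is 19."""
    return BGP_HEADER.pack(MARKER, BGP_HEADER.size, 4)


def parse_header(data):
    """Return (message type name, total message length) from the first 19 bytes."""
    marker, length, msg_type = BGP_HEADER.unpack(data[:BGP_HEADER.size])
    if marker != MARKER:
        raise ValueError("BGP marker must be all ones")
    return MESSAGE_TYPES.get(msg_type, "UNKNOWN"), length
```

For instance, `parse_header(build_keepalive())` yields the KEEPALIVE type name together with a length of 19.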
Path attributes and BGP routes
 BGP advertised route: prefix + attributes
• prefix: destination being advertised
• two important attributes:
• AS-PATH: list of ASes through which prefix advertisement has passed
• NEXT-HOP: indicates specific internal-AS router to next-hop AS
 policy-based routing:
• gateway receiving route advertisement uses import policy to
accept/decline path (e.g., never route through AS Y).
• AS policy also determines whether to advertise path to other neighboring ASes

Network Layer: 5-98


BGP path advertisement
[Figure: AS 1 (routers 1a-1d), AS 2 (2a-2d), AS 3 (3a-3d, with destination X); advertisement AS3,X from 3a to 2c, advertisement AS2,AS3,X from 2a to 1c]

 AS2 router 2c receives path advertisement AS3,X (via eBGP) from AS3 router 3a
 based on AS2 policy, AS2 router 2c accepts path AS3,X, propagates (via iBGP) to all
AS2 routers
 based on AS2 policy, AS2 router 2a advertises (via eBGP) path AS2, AS3, X to
AS1 router 1c
Network Layer: 5-99
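The accept-then-readvertise steps can be sketched with a small route record. This is an illustration under stated assumptions, not BGP's actual data model: `Route`, `import_ok`, and `export_via_ebgp` are hypothetical names, NEXT-HOP is simplified to a router label, and the rejection of paths that already contain the local AS (loop avoidance) is standard BGP behavior that the slides do not spell out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str        # destination being advertised
    as_path: tuple     # ASes the advertisement has passed through
    next_hop: str      # simplified: label of the router to forward to


def import_ok(route, own_as, refused_ases=frozenset()):
    """Import policy: drop looping paths and paths through refused ASes."""
    if own_as in route.as_path:
        return False
    return refused_ases.isdisjoint(route.as_path)


def export_via_ebgp(route, own_as, own_gateway):
    """Advertise to a neighboring AS: prepend own AS and update the next hop."""
    return replace(route, as_path=(own_as,) + route.as_path, next_hop=own_gateway)


# 2c learns AS3,X from 3a; after iBGP propagation, 2a advertises AS2,AS3,X to 1c.
learned = Route(prefix="X", as_path=("AS3",), next_hop="3a")
advertised = export_via_ebgp(learned, "AS2", "2a") if import_ok(learned, "AS2") else None
```

Passing `refused_ases=frozenset({"AS3"})` models an import policy such as "never route through AS3", in which case the route is declined and nothing is re-advertised.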
BGP path advertisement: multiple paths
[Figure: AS 1 (routers 1a-1d), AS 2 (2a-2d), AS 3 (3a-3d, with destination X); 1c receives AS2,AS3,X from 2a and AS3,X from 3a; AS3,X is then propagated within AS 1]

gateway router may learn about multiple paths to destination:


 AS1 gateway router 1c learns path AS2,AS3,X from 2a
 AS1 gateway router 1c learns path AS3,X from 3a
 based on policy, AS1 gateway router 1c chooses path AS3,X and advertises path
within AS1 via iBGP
Network Layer: 5-100
BGP: populating forwarding tables
[Figure: AS 1 (routers 1a-1d), AS 2, AS 3 with destination X; path AS3,X learned at 1c and propagated within AS 1; local link interfaces at 1a, 1d are numbered 1 and 2]

 recall: 1a, 1b, 1d learn via iBGP from 1c: “path to X goes through 1c”
 at 1d: OSPF intra-domain routing: to get to 1c, use interface 1
 at 1d: to get to X, use interface 1

forwarding table at 1d:
dest    interface
…       …
1c      1
X       1
…       …
BGP: populating forwarding tables
[Figure: AS 1 (routers 1a-1d), AS 2, AS 3 with destination X; local link interfaces at 1a, 1d are numbered 1 and 2]

 recall: 1a, 1b, 1d learn via iBGP from 1c: “path to X goes through 1c”
 at 1d: OSPF intra-domain routing: to get to 1c, use interface 1
 at 1d: to get to X, use interface 1
 at 1a: OSPF intra-domain routing: to get to 1c, use interface 2
 at 1a: to get to X, use interface 2

forwarding table at 1a:
dest    interface
…       …
1c      2
X       2
…       …
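The forwarding entry for X is the composition of two lookups: iBGP says which gateway leads to X, and intra-domain routing says which local interface leads to that gateway. A minimal sketch, with hypothetical dictionary and function names:

```python
def forwarding_entry(prefix, bgp_gateway, igp_interface):
    """Resolve an external prefix to a local interface via its exit gateway."""
    gateway = bgp_gateway[prefix]   # learned via iBGP
    return igp_interface[gateway]   # learned via OSPF intra-domain routing


# What every AS1 router learned via iBGP from 1c.
bgp_gateway = {"X": "1c"}

# Per-router OSPF results: interface used to reach gateway 1c.
igp_at_1d = {"1c": 1}
igp_at_1a = {"1c": 2}

table_1d = {"1c": igp_at_1d["1c"], "X": forwarding_entry("X", bgp_gateway, igp_at_1d)}
table_1a = {"1c": igp_at_1a["1c"], "X": forwarding_entry("X", bgp_gateway, igp_at_1a)}
```

The two tables differ only because 1a and 1d reach 1c over different interfaces; the BGP information they start from is identical.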
Hot potato routing
[Figure: AS 1, AS 2 (routers 2a-2d), AS 3 with destination X; 2a offers path AS1,AS3,X and 2c offers path AS3,X; OSPF link weights 112, 201 and 263 label links inside AS 2]

 2d learns (via iBGP) it can route to X via 2a or 2c
 hot potato routing: choose local gateway that has least intra-domain cost (e.g., 2d chooses 2a, even though more AS hops to X): don’t worry about inter-domain cost!
Network Layer: 5-103
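Hot potato selection reduces to taking the minimum intra-domain cost over the candidate gateways. In the sketch below the costs from 2d (201 to reach 2a, 263 to reach 2c) are an assumed reading of the OSPF link weights in the figure, chosen to be consistent with the slide's statement that 2d picks 2a; the function name is hypothetical.

```python
def hot_potato(candidate_gateways, igp_cost):
    """Pick the gateway with the least intra-domain cost, ignoring AS-path length."""
    return min(candidate_gateways, key=lambda gateway: igp_cost[gateway])


# Assumed intra-domain costs from 2d to each gateway offering a route to X.
igp_cost_from_2d = {"2a": 201, "2c": 263}
as_path_via = {"2a": ("AS1", "AS3"), "2c": ("AS3",)}

chosen = hot_potato(as_path_via.keys(), igp_cost_from_2d)
```

Here `chosen` is 2a even though its AS path is one hop longer than the path through 2c.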
BGP: achieving policy via advertisements
[Figure: provider networks A, B, C; customer networks w, x, y; A advertises path A,w to B and to C]

ISP only wants to route traffic to/from its customer networks (does not want to carry transit traffic between other ISPs – a typical “real world” policy)
 A advertises path Aw to B and to C
 B chooses not to advertise BAw to C!
 B gets no “revenue” for routing CBAw, since none of C, A, w are B’s customers
 C does not learn about CBAw path
 C will route CAw (not using B) to get to w
Network Layer: 5-104
BGP: achieving policy via advertisements (more)
[Figure: provider networks A, B, C; customer networks w, x, y; x is attached to both B and C]

ISP only wants to route traffic to/from its customer networks (does not want to carry transit traffic between other ISPs – a typical “real world” policy)
 A,B,C are provider networks
 x,w,y are customers (of provider networks)
 x is dual-homed: attached to two networks
 policy to enforce: x does not want to route from B to C via x
 .. so x will not advertise to B a route to C
Network Layer: 5-105
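Both policy examples follow from one rule of thumb: advertise a route to a neighbor only if the route was learned from a customer or the neighbor is a customer. The sketch below encodes that rule as an assumption; real export policies are configured per AS and can be richer. It also assumes x is B's customer, as the figure suggests, and the function name is hypothetical.

```python
def should_advertise(learned_from, advertise_to, customers):
    """Export a route only when a paying customer sits at one end of it."""
    return learned_from in customers or advertise_to in customers


# B learned path Aw from A; x is assumed to be B's only customer.
b_customers = {"x"}
b_tells_c = should_advertise(learned_from="A", advertise_to="C", customers=b_customers)
b_tells_x = should_advertise(learned_from="A", advertise_to="x", customers=b_customers)

# Dual-homed x has no customers, so it never relays routes between B and C.
x_tells_b = should_advertise(learned_from="C", advertise_to="B", customers=set())
```

Under this rule B keeps BAw from C but still offers it to x, and x never advertises to B a route it learned from C.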
BGP route selection
• router may learn about more than one route to destination
AS, selects route based on:
1. local preference value attribute: policy decision
2. shortest AS-PATH
3. closest NEXT-HOP router: hot potato routing
4. additional criteria

Network Layer: 5-106
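The first three selection criteria can be expressed as a lexicographic comparison. This sketch assumes routes are plain dictionaries with hypothetical keys, that a higher local preference wins (the usual BGP convention), and that "closest NEXT-HOP" means lowest intra-domain cost; the "additional criteria" tie-breakers are not modeled.

```python
def select_route(routes, igp_cost):
    """Pick the best route: local preference, then AS-PATH length, then IGP cost."""
    def rank(route):
        return (
            -route["local_pref"],          # 1. highest local preference
            len(route["as_path"]),         # 2. shortest AS-PATH
            igp_cost[route["next_hop"]],   # 3. closest NEXT-HOP (hot potato)
        )
    return min(routes, key=rank)


# Hypothetical candidates at a router in AS1 for destination X.
igp_cost = {"2a": 1, "3a": 5}
via_as2 = {"as_path": ("AS2", "AS3"), "next_hop": "2a", "local_pref": 100}
via_as3 = {"as_path": ("AS3",), "next_hop": "3a", "local_pref": 100}

best = select_route([via_as2, via_as3], igp_cost)
preferred = select_route([dict(via_as2, local_pref=200), via_as3], igp_cost)
```

With equal local preference the shorter AS-PATH through AS3 wins; raising the local preference of the AS2 route overrides that, which is how policy takes precedence over path length.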


Why different Intra-, Inter-AS routing ?
policy:
 inter-AS: admin wants control over how its traffic routed, who
routes through its network
 intra-AS: single admin, so policy less of an issue
scale:
 hierarchical routing saves table size, reduced update traffic
performance:
 intra-AS: can focus on performance
 inter-AS: policy dominates over performance

Network Layer: 5-107
