Segment Routing
A tutorial
• Paresh Khatri
• DD-02-2016
1. Introduction
2. Use cases and applicability
3. Deployment options
4. IGP extensions for segment routing (if time permits)
SCALE RSVP-TE
• SPRING (Source Packet Routing in Networking) Working Group addresses the following:
- IGP-based MPLS tunnels without the addition of any other signaling protocol
• The ability to tunnel services (VPN, VPLS, VPWS) from ingress PE to egress PE with or without
an explicit path, and without requiring forwarding plane or control plane state in intermediate
nodes.
- Fast Reroute
• Any topology, pre-computation and setup of backup path without any additional signaling.
• Support of shared-risk constraints, support of link/node protection, support of micro-loop
avoidance.
- Traffic Engineering
• The soft-state nature of RSVP-TE exposes it to scaling issues; particularly in the context of SDN
where traffic differentiation may be done at a finer granularity.
• Should include loose/strict options, distributed and centralised models, disjointness, ECMP-
awareness, limited (preferably zero) per-service state on midpoint and tail-end routers.
- All of this should allow incremental and selective deployment with minimal disruption
11 © Nokia 2016
Operations on segments
- PUSH: a segment is inserted at the head of the segment list and becomes the active segment.
- NEXT: the active segment is completed; the next segment becomes active.
- CONTINUE: the active segment is not completed and hence remains active.
• MPLS instantiation of Segment Routing aligns with the MPLS architecture defined in RFC 3031.
• For each segment, the IGP advertises an identifier referred to as a Segment ID (SID). A SID
is a 32-bit entity, with the MPLS label encoded as the 20 right-most bits of the SID.
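As a minimal sketch of the encoding described above, the following Python fragment recovers the MPLS label from a 32-bit SID by masking the 20 right-most bits. The function and constant names are illustrative only and are not taken from any router implementation.

```python
LABEL_BITS = 20
LABEL_MASK = (1 << LABEL_BITS) - 1  # 0xFFFFF: the 20 right-most bits


def sid_to_label(sid: int) -> int:
    """Return the MPLS label carried in the 20 right-most bits of a 32-bit SID."""
    if not 0 <= sid < 2**32:
        raise ValueError("a SID is a 32-bit entity")
    return sid & LABEL_MASK


def label_to_sid(label: int) -> int:
    """Place a 20-bit MPLS label in the 20 right-most bits of a SID."""
    if not 0 <= label <= LABEL_MASK:
        raise ValueError("an MPLS label is a 20-bit value")
    return label


print(sid_to_label(800))  # 800
```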
Segment routing with MPLS data plane (2)
• When Segment Routing is instantiated over the MPLS data-plane, the following actions
apply:
- A list of segments is represented as a stack of labels
- The active segment is the top label
- The SR CONTINUE operation is implemented as an MPLS SWAP operation
- The SR NEXT operation is implemented as an MPLS POP operation
- The SR PUSH operation is implemented as an MPLS PUSH operation
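The mapping above can be sketched as a small label-stack model. This is an illustration only: the class name, method names and label values are assumptions made for the example, not part of any MPLS implementation.

```python
class LabelStack:
    """Segment list held as an MPLS label stack; index 0 is the top label."""

    def __init__(self, labels=()):
        self.labels = list(labels)

    @property
    def active(self):
        """The active segment is the top label (None if the stack is empty)."""
        return self.labels[0] if self.labels else None

    def push(self, label):
        """SR PUSH -> MPLS PUSH: the new label becomes the active segment."""
        self.labels.insert(0, label)

    def continue_(self, outgoing_label):
        """SR CONTINUE -> MPLS SWAP: the active segment remains active."""
        if not self.labels:
            raise IndexError("empty label stack")
        self.labels[0] = outgoing_label

    def next(self):
        """SR NEXT -> MPLS POP: the following label becomes active."""
        if not self.labels:
            raise IndexError("empty label stack")
        return self.labels.pop(0)


stack = LabelStack()
stack.push(800)        # ingress imposes one segment
stack.push(300)        # ...and a second one on top of it
stack.continue_(300)   # a transit node swaps 300 for 300
stack.next()           # segment 300 is completed
print(stack.active)    # 800
```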
Types of segments
Prefix Segment
• Globally unique – allocated from SRGB
• Typically multi-hop
• ECMP-aware shortest-path IGP route to a related prefix
• Indexing or absolute SID
• Signaled by IGP
Adjacency Segment
• Locally unique – each SR router in the domain can use the same space
• Typically single-hop
• Signaled by IGP
BGP Prefix Segment
• Example Prefix Segment in DC environment
• DC GW representation
• Signaled by BGP (in DC)
BGP Peer Segment
• EPE: Egress Peering Engineering
• Influence how to control traffic to adjacent AS
• Signaled by BGP-LS (w/ EPE controller)
[Figure: DC attached to CORE/WAN domains AS1, AS2 and AS3.]
Segment identifiers – prefix segments
* In an MPLS architecture, SRGB is the set of local labels reserved for global segments.
• PE2 advertises Node Segment into IGP (Prefix-SID Sub-TLV Extension to IS-IS/OSPF)
• All routers in SR domain install the node segment to PE2 in the MPLS data-plane.
- No RSVP and/or LDP control plane required.
- When applied to MPLS, a Segment is essentially an LSP.
[Figure: PE1-P1-P2-P3-PE2 with Node-SIDs 100, 200, 300, 400 and 800 respectively. For FEC PE2, PE1 performs PUSH 800, P1 and P2 SWAP 800 to 800, and P3 performs POP 800 (PHP based on P-bit setting of the Prefix-SID advertised by PE2).]
• For traffic from PE1 to PE2, PE1 pushes on node segment {800} and uses shortest IGP
path to reach PE2.
• Active segment is the top of the stack for MPLS:
- P1 and P2 implement CONTINUE (swap) action in MPLS data-plane
- P3 implements NEXT (pop) action (based on P-bit in Prefix-SID not being set).
[Figure: the same PE1-P1-P2-P3-PE2 topology, showing the packet carrying label 800 on the first three hops and no label on the last hop to PE2.]
• No state held in network with the exception of segment list for tunnel held at PE1.
• The use of absolute SID values requires a single consistent SRGB on all SR routers
throughout the IGP domain.
• Example:
- PE2 advertises MP-BGP label 910 for VPN prefix Z.
- To forward traffic to VPN prefix Z, and again assuming preferred (non-ECMP) path
from PE1 to PE2 is PE1-P3-P4-
hop by hop.
[Figure: A-CE1-PE1 to PE2-CE2-Z across the core; Node-SIDs: PE1 100, P1 200, P2 300, PE2 600; MP-BGP label 910.]
Prefix SID indexing
• Why?
- SR domain can be multi-vendor, with the possibility that each vendor uses a different MPLS label range
- Prefix SID must be globally unique within SR domain
• How?
- Indexing mechanism is required for prefix SIDs. All routers within the SR domain are expected to
configure and advertise the same prefix SID index range for a given IGP instance.
- The label value used by each router to represent a prefix ‘Z’ (= label programmed in ILM) can be
local to that router by the use of an offset label, referred to as a start label:
Local Label (for Prefix SID) = (local) start-label + {Prefix SID index}
• For example, assume the SRGB is {1000, 1999}, and the SID Index Range is {1,100}.
• Each SR router in the domain defines a start point in the SRGB (start-label), and an offset
label called an SID index.
- SR routers sum {start-label + SID index} to obtain a local label for a Prefix SID.
- Assuming PE2 advertises loopback 192.0.2.2/32 with a prefix index of 2:
- PE2’s SID is {1010+2} = 1012
- P4’s SID is {1020+2} = 1022
[Figure: A-CE1-PE1 to PE2-CE2-Z with core routers P1, P2, P3 and P4. Start labels: PE1 1060, P1 1050, P2 1040, PE2 1010. PE2 advertises MP-BGP label 910; packets are shown with their label stacks along the path.]
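The index arithmetic above can be sketched in a few lines of Python. The SRGB, index range and start labels come from the example; the function name and the range checks are assumptions added for illustration.

```python
SRGB = (1000, 1999)        # label range from the example
SID_INDEX_RANGE = (1, 100)  # index range from the example

# Start labels quoted in the example text.
START_LABELS = {"PE2": 1010, "P4": 1020}


def local_label(start_label: int, sid_index: int) -> int:
    """Local Label (for Prefix SID) = (local) start-label + Prefix SID index."""
    low, high = SID_INDEX_RANGE
    if not low <= sid_index <= high:
        raise ValueError("SID index outside the advertised index range")
    label = start_label + sid_index
    if not SRGB[0] <= label <= SRGB[1]:
        raise ValueError("resulting label falls outside the SRGB")
    return label


# Labels that PE2 and P4 program locally for PE2's loopback (prefix index 2).
for router, start in START_LABELS.items():
    print(router, local_label(start, 2))
```

Each router programs a different label for the same prefix, yet every router can derive its neighbour's label from the advertised index and that neighbour's start label.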
Example: SR tunnel with node and adjacency segments
• PE1 therefore imposes the segment list {300, 1003, 800} representing the Node-SID for P2,
the Adj-SID for link P2-P5, and finally the Node-SID for PE2.
[Figure: PE1 (Node-SID 100) and PE2 (Node-SID 800) connected through P1, P2 (Node-SID 300) and P3 on the upper path and P4, P5 and P6 (Node-SIDs 500, 600 and 700) on the lower path; Adj-SID 1003 on link P2-P5; the label stack shrinks hop by hop, ending with SWAP 800 and POP 800.]
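To make the meaning of the segment list concrete, the sketch below expands {300, 1003, 800} into the forwarding instruction each segment represents. The SID tables contain only the values quoted in the bullet; the function name and the wording of the instructions are assumptions made for the example.

```python
# SIDs taken from the example: Node-SIDs for P2 and PE2, Adj-SID for link P2-P5.
NODE_SIDS = {300: "P2", 800: "PE2"}
ADJ_SIDS = {1003: ("P2", "P5")}


def describe_segment_list(segments):
    """Translate each SID in a segment list into the instruction it encodes."""
    steps = []
    for sid in segments:
        if sid in NODE_SIDS:
            # Node segment: follow the (ECMP-aware) shortest IGP path.
            steps.append(f"shortest IGP path to {NODE_SIDS[sid]}")
        elif sid in ADJ_SIDS:
            # Adjacency segment: force the packet over one specific link.
            local, remote = ADJ_SIDS[sid]
            steps.append(f"link {local}-{remote}")
        else:
            raise KeyError(f"unknown SID {sid}")
    return steps


for step in describe_segment_list([300, 1003, 800]):
    print(step)
```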
Comparison with LDP and RSVP-TE
              LDP                  RSVP-TE                      SR
Overview      Multipoint to point  Point to point               Multipoint to point
Operation     Simple               LSP per destination/TE-path  Simple
Dependencies  Relies on IGP        Relies on IGP TE             Relies on IGP + offline TE
explicit-route on a hop-by-hop basis, but has the potential to
• Disjointness describes two (or more) services that must be completely disjoint of each
other. They should not share common network infrastructure – i.e. if one fails, the other
must always be active.
• Many networks employ the ‘dual-plane’ design, where inter-plane links are configured
such that the route to a destination stays on that plane during a single failure scenario.
• Disjointness can broadly be achieved using Anycast segments.
• Assume service 1 between PE1 and PE3 must be disjoint from service 2 between PE2
and PE4:
- Service 1 at PE1 has segment list {902, 300} including Anycast SID 902 and traverses the red plane
before reaching PE3.
- Service 2 at PE2 has segment list {901, 400} including Anycast SID 901 and traverses the blue plane
before reaching PE4.
[Figure: dual-plane topology with routers P1 to P8. Red plane: Anycast SID 902. Blue plane: Anycast SID 901. Node-SIDs: PE1 100, PE2 200, PE3 300, PE4 400.]
EPE Controller
- 20% traffic to AS 200 with segment list {100, 1006}
- Prefix <NLRI/Length> segment list {100, 1003}
- Prefix <NLRI/Length> segment list {100, 1004}
Peering segments (label, operation, outgoing link):
1002  POP  Link to R8
1003  POP  Upper link to R9
1004  POP  Lower link to R9
1005  POP  Load-balance on any link to R9
1006  POP  Load-balance on any link to R7 or R8
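A minimal sketch of how the peering-segment table resolves a controller-supplied segment list. The table rows are those listed above; the assumption that the first label (100) is the Node-SID of the egress peering router is mine, as the slide does not state it, and the function name is illustrative.

```python
# Peering segments as listed in the table: label -> outgoing link(s).
PEERING_SEGMENTS = {
    1002: "Link to R8",
    1003: "Upper link to R9",
    1004: "Lower link to R9",
    1005: "Load-balance on any link to R9",
    1006: "Load-balance on any link to R7 or R8",
}


def egress_action(segment_list):
    """Return (operation, outgoing link) for the peering label of a segment list.

    Assumption: the first label steers the packet to the egress peering
    router and the last label selects the peering link(s) on that router.
    """
    peering_label = segment_list[-1]
    return ("POP", PEERING_SEGMENTS[peering_label])


print(egress_action([100, 1006]))
print(egress_action([100, 1003]))
```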
Use case 5: Adjacency segment load-balancing (1)
[Figure: PE1-P1-P2-PE2 with Node-SIDs 100, 200, 300 and 800; a 10G link and a 40G link are shown; packets carry label 800.]
Use case 5: Adjacency segment load-balancing (2)
• Traffic Engineering information made available to CSPF for RSVP-TE based LSPs
can also be made available to SR tunnels
- Includes available link bandwidth, admin-groups, shared-risk link groups (SRLGs) etc.
• In the example topology, assume that link P1-P2 is in SRLG 1.
- The SRLG information is flooded into IS-IS (RFC 5307) or OSPF (RFC 4203).
[Figure: PE1-P1-P2-PE2 with Node-SIDs 100, 200, 300 and 800; link P1-P2 is marked SRLG 1; P3 and P4 are shown below P1 and P2.]
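The effect of an SRLG constraint on path computation can be sketched as a shortest-path search over a pruned topology. Only the membership of link P1-P2 in SRLG 1 comes from the example; the remaining links, the metrics and the function name are assumptions made for illustration, and the search is plain Dijkstra rather than any vendor's CSPF.

```python
import heapq

# (node_a, node_b, igp_metric, srlgs). Only "P1-P2 is in SRLG 1" is from the
# example; the other links and all metrics are assumed.
LINKS = [
    ("PE1", "P1", 10, set()),
    ("P1", "P2", 10, {1}),
    ("P2", "PE2", 10, set()),
    ("P1", "P3", 10, set()),
    ("P3", "P4", 10, set()),
    ("P4", "P2", 10, set()),
]


def cspf(links, src, dst, exclude_srlgs=frozenset()):
    """Shortest path from src to dst that avoids links in excluded SRLGs."""
    excluded = set(exclude_srlgs)
    graph = {}
    for a, b, metric, srlgs in links:
        if srlgs & excluded:
            continue  # constraint step: prune links sharing an excluded risk
        graph.setdefault(a, []).append((b, metric))
        graph.setdefault(b, []).append((a, metric))

    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, metric in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + metric, neighbour, path + [neighbour]))
    return None  # no path satisfies the constraint


print(cspf(LINKS, "PE1", "PE2"))
print(cspf(LINKS, "PE1", "PE2", exclude_srlgs={1}))
```

With the constraint applied, the head-end would translate the resulting path into a segment list instead of signalling it with RSVP-TE.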
Use case 6: Distributed CSPF-based traffic engineering (2)
• If an MPLS Control Plane Client (e.g. LDP, RSVP, BGP, SR) installs forwarding entries into
the MPLS data-plane, those entries need to be unique in order to function as “Ships in the
Night”.
• It’s also likely that these control planes can and will co-exist. For example, LDP and SR
could co-exist, where:
- LDP and SR are present on all routers in the network.
Preference for LDP or SR for service tunnels is a local
matter at the head-end. SR can also be used to
enhance FRR coverage.
- SR is only present in parts of the network. LDP and SR
can be interworked to provide an end-to-end tunnel
and/or an FRR tunnel due to the presence of an SR
Mapping Server (SRMS).
• Requirements:
- Service 1 to be tunneled via LDP
- Service 2 to be tunneled via SR
• Outcome:
- Service 1 is tunneled from PE1 to PE3 through P2 and P3.
[Figure: PE1 and PE3 connected through P1, P2 and P3; Node-SIDs 101, 102 and 103; MP-BGP label 910 for Service 1 and MP-BGP label 860 for Service 2; LDP label 423 shown; legend: LDP-only router, SR-only router.]
Segment routing and LDP inter-operability
Scenario 1: Ships-in-the-night co-existence (cont.)
• Possible to have multiple entries in the MPLS data plane for the same prefix.
Node P1’s MPLS forwarding table:
FEC                    Incoming Label  Outgoing Label  Next-Hop
192.0.2.203/32 (LDP)   423             700             P2
192.0.2.203/32 (SR)    204             204             4
[Figure: PE1-P1-P2-P3-PE3 (loopback 192.0.2.203/32); Service 1 packets carry MP-BGP label 910 beneath labels 423, 700 and 819 along the path; Service 2 uses MP-BGP label 860; Node-SIDs 101, 102 and 103; legend: LDP-only router, SR-only router, LDP+SR router.]
Segment routing and LDP inter-operability
Scenario 2: Migration from LDP to SR
• Stage 1:
- All routers initially run only LDP. All
services are tunneled from the ingress
PE to the egress PEs over a continuous
LDP LSP.
[Figures: the migration stages on a topology of PE1, PE2, PE3, PE4, P5, P6 and P7. Node-SIDs are introduced progressively: PE2 102, PE4 104, P5 105, P6 106, P7 107. Legend: LDP-only router, SR-only router, LDP+SR router.]
routers A, B, and C.
- Router A has services to B and C. LDP is the preferred
[Figure: routers R4, R5, R6 and R7 (Node-SIDs 104, 105, 106 and 107) with link metrics 10, 10 and 30; Service 1 (A-B) and Service 2 (A-B); legend: SR-only router, LDP+SR router.]
Prefix-SID sub-TLV
Sub-TLV format: Type, Length, Flags, Algorithm, SID/Index/Label (variable)
• Introduction of Prefix-SID sub-TLV, which may be present in either:
- TLV-135 (IPv4), TLV-235 (MT-IPv4)
- TLV-236 (IPv6), TLV-237 (MT-IPv6)
• SID/Index/Label contains either:
- A 32-bit index defining the offset in the SID/Label space advertised by this router
- A 24-bit label, where the 20 rightmost bits are used for encoding the label value
- A variable length SID (i.e. an IPv6 address SID)
Flag: Meaning
- R-flag: Re-advertisement flag. If set, the prefix to which this Prefix-SID is attached has been propagated by the router either from another level (L2 to L1 or vice-versa) or from redistribution.
- N-flag: Node-SID flag. If set, the Prefix-SID refers to the router identified by the prefix (router loopback/system address). The prefix to which the SID is attached must have a prefix length of /32 (IPv4) or /128 (IPv6).
- P-flag: No-PHP flag. If set, the penultimate hop must not pop the Prefix-SID before delivering the packet to the advertising router.
- E-flag: Explicit-Null flag. If set, any upstream neighbour of the Prefix-SID originator must replace the Prefix-SID with a Prefix-SID having an Explicit-Null value before forwarding the packet.
- V-flag: Value flag. If set, the Prefix-SID carries an absolute value (instead of an index).
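The flag semantics above can be sketched as a small decoder. The bit positions (R, N, P, E, V, L from the most significant bit) are not given on the slide; they follow my reading of the published IS-IS segment routing extensions (RFC 8667) and should be treated as an assumption, as should the function names. For simplicity the sketch keys on the V-flag alone to distinguish a label from an index.

```python
# Assumed bit layout of the Flags octet (most significant bit first): R N P E V L.
PREFIX_SID_FLAGS = {"R": 0x80, "N": 0x40, "P": 0x20, "E": 0x10, "V": 0x08, "L": 0x04}
LABEL_MASK = 0xFFFFF  # the 20 rightmost bits carry the label value


def decode_flags(flags_octet: int) -> set:
    """Return the names of the flags set in a Prefix-SID Flags octet."""
    return {name for name, mask in PREFIX_SID_FLAGS.items() if flags_octet & mask}


def decode_sid(flags_octet: int, value: bytes):
    """Interpret SID/Index/Label as a 24-bit label or a 32-bit index."""
    if "V" in decode_flags(flags_octet):
        if len(value) != 3:
            raise ValueError("an absolute value is carried as a 24-bit label")
        return ("label", int.from_bytes(value, "big") & LABEL_MASK)
    if len(value) != 4:
        raise ValueError("an index is carried as a 32-bit value")
    return ("index", int.from_bytes(value, "big"))


# Node-SID advertised as index 2, with neither No-PHP nor Explicit-Null set.
print(sorted(decode_flags(0x40)), decode_sid(0x40, (2).to_bytes(4, "big")))
```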
single Adjacency, or the same Adj-SID can be allocated to multiple adjacencies.
- Used for load-balancing
• SID/Index/Label contains either:
- A 32-bit index defining the offset in the SID/Label space advertised by this router
- A 24-bit label, where the 20 rightmost bits are used for encoding the label value
- A variable length SID (i.e. an IPv6 address SID)
Flag: Meaning
- B-flag: Backup flag. If set, the Adj-SID refers to an adjacency that is being protected (using Fast Reroute techniques). This allows a head-end SR router to select only links that are protected throughout the domain if the SLA for the SR tunnel dictates this.
- V-flag: Value flag. If set, the Adj-SID carries a value (default is set).
- L-flag: Local flag. If set, the value/index carried by the Adj-SID has local significance.
- S-flag: Set flag. When set, indicates that the Adj-SID refers to a set of adjacencies (and therefore may be assigned to other adjacencies as well).
IS-IS extensions
FM
SID/label binding TLV
• May be originated by any router in
TLV format: Type, Length, Flags, Weight
• Algorithm specifies algorithm the Prefix-SID is associated with:
• May also be carried in SR-Algorithm TLV of Router Information Opaque LSA.
– A 32-bit index defining the offset in the SID/Label space advertised by this router
– A 24-bit label, where the 20 rightmost bits are used for encoding the label value
Flag: Meaning
- N-flag: Node-SID flag. If set, the Prefix-SID refers to the router identified by the prefix (router loopback/system address). The prefix to which the SID is attached must have a prefix length of /32 (IPv4) or /128 (IPv6).
- P-flag: No-PHP flag. If set, the penultimate hop must not pop the Prefix-SID before delivering the packet to the advertising router.
- M-flag: Mapping Server flag. If set, the SID is advertised from the Segment Routing Mapping Server.
- E-flag: Explicit-Null flag. If set, any upstream neighbour of the Prefix-SID originator must replace the Prefix-SID with a Prefix-SID having an Explicit-Null value before forwarding the packet.