Multicast Overview
Brian Edwards
Juniper Networks
Tech Support
[email protected]
February 20, 2001
Several protocols are required to enable IP multicast over multiple domains. These protocols, as
well as their documentation, have been developed independently, and each has its own set of
specific terms. The multiplicity of protocol development makes it difficult to create a workable,
integrated multicast implementation.
Thus, the point of this chapter is to resolve the difficulty of implementation using these several
protocols, and to provide a working-level illustration of an interdomain multicast routing system,
end to end across multiple domains.
Figure 2-1 shows a simple network, and will serve as a reference for subsequent discussion in this
chapter. The diagram shows two interconnected autonomous systems (AS) with fully functional
unicast routing.
Each autonomous system is controlled by an independent organization and has its own IGP.
Based on information gained from the independent IGPs, Router-C and Router-D, in the diagram,
each run an EBGP session to exchange routing information with Router-E and Router-F.
Figure 2-1. Base Internetwork
We take the reader step-by-step, showing how to enable these two autonomous systems to route
multicast traffic between them. Each step will concentrate on a portion of this diagram and
describe the mechanisms that need to be put in place for operational multicast routing.
Specifically, the focus will be on allowing Host-B to receive multicast traffic from Server-A.
To avoid a condition in which all hosts bombard the local network with redundant information,
two strategies are used:
1) When a query is received, a host waits a random amount of time before responding for each
group in which it is interested.
2) The host, in its IGMP report message responses, sets the destination address to the group
address being reported. If a host receives a report from a different host on the subnet, it
suppresses the sending of its own report for that group.
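The two strategies above can be sketched as a small simulation. This is only an illustration under simplifying assumptions that are not part of the text: the LAN is idealized (a report is heard by every other member before any other timer expires), and the function name, the max_delay parameter, and the host names are hypothetical.

```python
import random

def simulate_query_response(hosts, group, max_delay=10.0, seed=0):
    """Return the hosts that actually send a report for `group`.

    hosts: dict mapping host name -> set of groups the host has joined.
    Assumption: zero propagation delay, so the first report is heard by
    every other member before its own timer expires.
    """
    rng = random.Random(seed)
    # Strategy 1: each interested host picks a random response delay.
    timers = {name: rng.uniform(0, max_delay)
              for name, groups in hosts.items() if group in groups}
    senders = []
    suppressed = set()
    for name in sorted(timers, key=timers.get):
        if name in suppressed:
            continue
        senders.append(name)
        # Strategy 2: the report is addressed to the group itself, so every
        # other member hears it and cancels its own pending report.
        suppressed.update(n for n in timers if n != name)
    return senders
```

With any number of interested hosts, only the host whose timer expires first sends a report in this idealized model; the router still learns that the group has at least one member on the subnet.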
IGMP version 2, with the addition of a leave-group message, enables hosts to report that they are no
longer interested in a group. The router responds to this message with a group-specific query
message to determine if any other hosts are still interested in the group.
IGMP version 3 enables hosts to participate in source-specific multicast routing (SSM).
Source-specific multicast routing is described in Chapter 8.
1. In reality, there is no such thing as a PIM JOIN message. It is officially named a Join/Prune
message, because the same message can hold information for joining and pruning various
distribution trees. But for clarity, we will refer to Join/Prune messages as either “Join” or “Prune,”
depending on the function of the message in a given instance.
The process of building the RPT for 230.1.1.1 is illustrated in the following diagram.
It is easy to see the meaning of Reverse Path Forwarding with this example. The JOIN
messages are forwarded in the reverse direction of the path from the RP to Host-A. The
inbound interface of a JOIN message is added to the outbound interface list for forwarding
multicast data packets.
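The way JOIN messages build forwarding state can be sketched as follows. This is a minimal illustration, not an implementation of PIM: the function name, the shape of the rpf_toward_rp mapping, and the interface labels are assumptions introduced here, and the unicast paths toward the RP are assumed to be loop-free.

```python
def build_rpt(receiver_dr, receiver_iface, rpf_toward_rp):
    """Propagate a (*, G) JOIN hop by hop from the receiver's DR to the RP.

    rpf_toward_rp maps router -> (upstream_router, iface_on_upstream), i.e.
    the RPF neighbor toward the RP and the interface on which that neighbor
    receives our JOIN. The RP itself has no entry, which ends the walk.
    Returns {router: set of outbound interfaces for the group}.
    """
    # The DR forwards toward the LAN where the interested host lives.
    oil = {receiver_dr: {receiver_iface}}
    router = receiver_dr
    while router in rpf_toward_rp:
        upstream, inbound_iface = rpf_toward_rp[router]
        # The interface the JOIN arrived on becomes an outbound interface
        # for data packets traveling in the opposite direction.
        oil.setdefault(upstream, set()).add(inbound_iface)
        router = upstream
    return oil
```

For example, if Router-C's RPF neighbor for the RP is Router-B, then calling build_rpt("Router-C", "eth0", {"Router-C": ("Router-B", "p2p-to-C")}) leaves Router-B forwarding out the hypothetical interface "p2p-to-C" and Router-C forwarding out "eth0".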
At this point no traffic is flowing, because Server-A has yet to start sending data packets to
the 230.1.1.1 group. In phase 2, the distribution tree is built to deliver packets from the
source to the RP.
Phase 2:
Building the Distribution Tree That Delivers Packets from Source to RP
Once the RPT is built, it remains in place even if no active sources exist to generate traffic to
the group. As soon as a source emerges, its traffic is instantly delivered to the receivers
although no end-to-end distribution tree exists from the source to all receivers.
This is accomplished as the PIM DR directly connected to the source encapsulates the data in
PIM REGISTER messages, and sends these PIM REGISTER messages via unicast routing to
the RP address. The PIM DR connected to the source sends a REGISTER message each time
it receives a multicast packet from the source. If an RP receives a PIM REGISTER message
for a group for which it has set up an existing RPT, then it does two things:
1) Delivers the encapsulated multicast packet down the RPT to the receivers
2) Sends a PIM JOIN message towards the source to create a distribution tree
When the distribution tree is set up, the RP begins to receive duplicate multicast packets.
One copy is delivered via multicast routing down the newly created distribution tree, and the
other is decapsulated from the REGISTER messages sent by the PIM DR.
As soon as the RP receives the first native multicast packet for the source-group pair, it sends
a REGISTER-STOP message to the PIM DR. When the PIM DR receives the REGISTER-
STOP messages, it stops sending REGISTER messages for the source-group pair.
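The register sequence described above can be traced with a short simulation. It is a sketch under assumptions that are not in the text: the RP already has an RPT for the group, and each control message (JOIN, REGISTER-STOP) takes effect before the next data packet is sent. The function name is hypothetical.

```python
def simulate_register(num_packets):
    """Return, per data packet, the forms in which it reaches the RP."""
    registering = True     # the PIM DR still encapsulates data in REGISTERs
    native_path = False    # the RP's JOIN has reached the PIM DR
    trace = []
    for _ in range(num_packets):
        copies = []
        join_sent = False
        if registering:
            copies.append("register")
            # The first REGISTER makes the RP send a JOIN toward the source.
            join_sent = not native_path
        if native_path:
            copies.append("native")
            # The first native packet makes the RP send REGISTER-STOP,
            # so the DR stops encapsulating from the next packet on.
            registering = False
        if join_sent:
            native_path = True
        trace.append(copies)
    return trace
```

Under these timing assumptions the first packet arrives only inside a REGISTER, the second arrives twice (the duplicate the text describes), and later packets arrive only natively.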
Phase 2 on the Example Network
Phase 2 begins when Server-A starts sending data packets with a destination address of
230.1.1.1. Router-A recognizes these packets as being sent by a directly connected source on
a LAN for which Router-A is serving as the PIM DR. Router-A notes that this is the first packet
received from this source for the group, because it has no existing (Server-A, 230.1.1.1) state.
Router-A creates this state, adding its Ethernet interface as the upstream interface for the
source-group pair. It then unicasts a PIM REGISTER message to the RP. The REGISTER
message has the data packet encapsulated within it. Router-A sends a separate REGISTER
message to Router-B for every data packet from Server-A to 230.1.1.1 until it receives a
REGISTER-STOP message from Router-B.
Router-B receives each REGISTER message, decapsulates the data packet, and sends it down
the RPT (which was set up in Phase 1). The initial REGISTER message received by Router-
B causes it to send a JOIN message to its RPF neighbor for Server-A. In this case Router-A
is that RPF neighbor. Router-A adds its interface towards Router-B to its outbound interface
list for the source-group pair.
At this point each packet sent from Server-A to 230.1.1.1 is sent two times to the RP. It is
both encapsulated in a REGISTER message and sent directly out the interface in the
outbound interface list. Another way to describe this is that the outbound interface list has
two entries: (1) the point-to-point interface connecting to Router-B and (2) the virtual interface
formed by encapsulating data in REGISTERs.
Once Router-B starts receiving the data packets natively (i.e., not encapsulated in REGISTER
messages), it sends a REGISTER-STOP message to Router-A. When Router-A receives the
REGISTER-STOP, it removes the virtual interface formed by encapsulating data in
REGISTERs from its outbound interface list and forwards the packets only natively.
The following diagram illustrates Phase 2.
Figure 2.3. Phase 2
When Phases 1 and 2 are completed, the packets sent to 230.1.1.1 by Server-A are traveling to
Host-A via the RP. The packets are delivered successfully, but the shortest path through the
network is not being used. Phase 3 allows for these packets to be shortcut directly from
Router-A to Router-C.
Phase 3:
Building the SPT that Delivers Packets Directly from the Source to the
Interested Listeners
This is the final phase. The SPT is worth creating because the path via the RP that initially
delivers traffic may not be the optimal path from the source to each receiver.
Once a PIM DR for a subnet with one or more interested listeners starts receiving multicast
packets from a particular source, it initiates the creation of the SPT by sending a PIM JOIN
message to its RPF neighbor for the source’s IP address.
When the SPT is formed, the PIM DR will start receiving two copies of each packet sent by
the source. One copy is received via the newly created SPT and the other is delivered via the
RPT. To avoid this redundancy, the PIM DR sends a PIM PRUNE message toward the RP.
This informs the RP that it is no longer necessary to forward multicast packets for this
source-group pair down the RPT.
At this point, as long as the sources and receivers remain static, PIM’s task of setting up the
optimal delivery of packets from the source to all receivers for the multicast group is finished.
Remember, the mechanisms discussed in this section only work if the sources and receivers
are all in the same PIM domain; that is, the PIM DRs for all sources and receivers agree on the
same IP address for the RP of the multicast group.
Phase 3 on the Example Network
Phase 3 on the example network starts when Router-C receives the initial data packet for the
source-group pair, (Server-A, 230.1.1.1). Router-C knows that it received this packet down
the RPT and on its RPF interface for the RP. Router-C initiates the creation of the SPT, by
forwarding a (Server-A, 230.1.1.1) JOIN message to its RPF neighbor for Server-A. Its RPF
neighbor for Server-A is Router-A.
Router-A adds the interface connecting it to Router-C to its outbound interface list for
(Server-A, 230.1.1.1). At this point Router-A forwards the data packets out both of its
point-to-point interfaces. Router-C receives the packets twice: once directly from Router-A
down the SPT, and once down the RPT from Router-B.
When Router-C receives the first packet down the SPT, it sends a (Server-A, 230.1.1.1, RPT)
PRUNE message to its RPF neighbor for the RP. Its RPF neighbor for the RP is Router-B.
When Router-B receives this PRUNE message, it removes its point-to-point interface
connecting to Router-C from its outbound interface list for (*, 230.1.1.1).
Because Router-B now has no interfaces on this outbound interface list, it sends a (Server-A,
230.1.1.1) PRUNE message to its RPF neighbor for Server-A (Router-A). Router-A receives
the PRUNE message and removes its interface connecting Router-B from its outbound
interface list for (Server-A, 230.1.1.1).
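The state changes in this walkthrough can be replayed step by step. The sketch below only mirrors the sequence as the text describes it; the function name and the interface labels ("to-Router-B", "to-Router-C") are hypothetical, and real PIM implementations keep additional per-entry state that is not modeled here.

```python
def phase3_switchover():
    """Replay the Phase 3 changes to the outbound interface lists."""
    source, group = "Server-A", "230.1.1.1"
    # State at the end of Phase 2, keyed by (router, tree).
    oil = {
        ("Router-A", (source, group)): {"to-Router-B"},
        ("Router-B", ("*", group)): {"to-Router-C"},
    }
    # Router-C sends an (S, G) JOIN to its RPF neighbor for the source.
    oil[("Router-A", (source, group))].add("to-Router-C")
    # Router-C sends an (S, G, RPT) PRUNE to its RPF neighbor for the RP.
    oil[("Router-B", ("*", group))].discard("to-Router-C")
    # With nothing left downstream, Router-B prunes itself from the
    # source tree by sending an (S, G) PRUNE to Router-A.
    if not oil[("Router-B", ("*", group))]:
        oil[("Router-A", (source, group))].discard("to-Router-B")
    return oil
```

After the call, Router-A forwards (Server-A, 230.1.1.1) traffic only toward Router-C, and Router-B has no outbound interfaces left for the group.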
The following diagram illustrates the control messages for phase 3.
At this point Router-C has two interfaces in its outbound interface list for (Server-A,
230.1.1.1). Those are its Ethernet interface and its point-to-point interface connecting
Router-E. The SPT for the source-group pair successfully delivers data packets from Server-
A to both Host-A and Host-B.
Based on fewest AS hops, the optimal path for unicast traffic traveling from AS 100 to AS 500
is through AS 400. The problem, however, is that AS 400 does not support multicast routing. If the
same routing table used to forward unicast traffic is also used as the RPF table in all routers,
and multicast traffic must flow from AS 500 to AS 100, then AS 100 is compelled to use
suboptimal routing for its unicast traffic destined for AS 500. Unicast traffic from AS 100
destined for AS 500 would be forced to traverse the path across AS 200 and AS 300.
To circumvent this limitation, a table other than the one used for unicast forwarding must be
used for multicast RPF. The question is how to populate such a table: How are unicast routes
introduced into a separate RPF table, with next-hop information different from the table used
for unicast forwarding?
One solution is to configure static routes specifically for the RPF table. Note that static routing
for multicast RPF faces the same scalability limitations as static routing for unicast forwarding.
Those limitations are the lack of dynamic failover and the maintenance burden that arises
because topology changes are not reflected automatically.
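A statically configured RPF table can be pictured as a second lookup table that is consulted only for the RPF check. The sketch below is an illustration under assumptions: the prefix 10.5.0.0/16, the neighbor labels, and the function names are hypothetical and do not come from the text.

```python
import ipaddress

def longest_match(table, address):
    """Return the next hop of the most specific prefix covering `address`."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

def rpf_check(rpf_table, source, arriving_neighbor):
    """Accept a multicast packet only if it arrived from the RPF neighbor."""
    expected = longest_match(rpf_table, source)
    return expected is not None and expected == arriving_neighbor

# Hypothetical tables as seen from AS 100 for a prefix inside AS 500.
UNICAST_TABLE = {"10.5.0.0/16": "peer-in-AS400"}      # used for forwarding
STATIC_RPF_TABLE = {"10.5.0.0/16": "peer-in-AS200"}   # used only for RPF
```

Here unicast traffic toward 10.5.0.0/16 still leaves via the AS 400 peer, while multicast packets from a source in that prefix pass the RPF check only when they arrive from the AS 200 peer.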
In real networks, it is desirable to update the entries in the RPF table dynamically. The RPF
table consists of unicast routes, so there is no need to invent a new routing protocol. Instead,
what is needed is a way to differentiate between routing information intended for the
unicast forwarding table and routing information intended for the multicast RPF table.
Theoretically this differentiation could be implemented by modifying any of the existing
unicast routing protocols. The structure of some protocols, however, facilitates expanded
functionality. From the perspective of software developers, BGP is one of the easiest protocols
to which to add such functionality.
Within the BGP protocol, a capabilities negotiation occurs between peers when they first
establish a session. Rules have also been defined for handling BGP-learned routes with path
attributes that are not understood. In general, the BGP4 specification, RFC1771, is written with
expansion of the protocol’s capabilities in mind; the Community attribute is one example of
such an expansion. (Note that the Community attribute is not part of the BGP4 specification.
It is defined in a later document, RFC1997.)
Currently BGP is the only dynamic routing protocol that can differentiate between multiple
types of routing information. This capability is designated Multiprotocol Extensions for BGP
(MBGP) and is defined in RFC2283. MBGP works identically to BGP in all respects; it simply
adds functionality to BGP, such as the capability for BGP updates to tag routing information as
belonging to a specific protocol or function within that protocol.
When using MBGP for updating dedicated multicast RPF tables, two sets of routes are
exchanged in the MBGP updates:
• IPv4 unicast routes
• IPv4 multicast RPF routes
Each set will most likely have duplicated prefixes, but the path information for the same prefix
in each set can be different. Not only can multicast RPF routes have different BGP next-hops
(and therefore potentially different recursive next-hops), they can also have different
information in any of the BGP path attributes.
From the AS 100 perspective in Figure 2-2 above, MBGP allows for destinations in AS 500
to be learned through the connection to both AS 200 and AS 400. The path through AS 400
will be preferred for unicast packet forwarding, with the path through AS 200 and AS 300
serving as backup. Meanwhile, path selection for multicast RPF routes will be limited to the
path through AS 200 and AS 300.
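The effect of keeping the two route sets separate can be sketched as an independent best-path choice per set. This is a deliberately reduced model: the real BGP decision process has many more steps than AS-path length, and the Route class, the best_paths function, and the prefix 10.5.0.0/16 are hypothetical. The values 1 and 2 are the subsequent address family identifiers that the multiprotocol extensions assign to unicast and multicast forwarding.

```python
from dataclasses import dataclass

UNICAST, MULTICAST = 1, 2   # SAFI values: unicast forwarding, multicast RPF

@dataclass(frozen=True)
class Route:
    prefix: str
    safi: int
    as_path: tuple

def best_paths(routes):
    """Pick, per (prefix, SAFI), the route with the shortest AS path."""
    best = {}
    for route in routes:
        key = (route.prefix, route.safi)
        if key not in best or len(route.as_path) < len(best[key].as_path):
            best[key] = route
    return best

# Routes as AS 100 might learn them for a hypothetical prefix in AS 500.
# AS 400 does not support multicast, so it offers no multicast RPF route.
EXAMPLE_ROUTES = [
    Route("10.5.0.0/16", UNICAST, (400, 500)),
    Route("10.5.0.0/16", UNICAST, (200, 300, 500)),
    Route("10.5.0.0/16", MULTICAST, (200, 300, 500)),
]
```

Running best_paths(EXAMPLE_ROUTES) selects the path through AS 400 for unicast forwarding and the path through AS 200 and AS 300 for multicast RPF, because the two sets never compete with each other.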