CN-Module All Modules Notes

Module 3 covers the network layer's role in communication across interconnected networks, detailing its functions such as packetizing, routing, and forwarding. It explains the differences between connectionless and connection-oriented services, including packet switching techniques like datagram and virtual-circuit approaches. The document also discusses network layer services, including error control, flow control, congestion control, quality of service, and security concerns.

Uploaded by

extraworkuse123

Module 3: Network Layer

Communication at Network Layer


• The figure shows that the Internet is made of many networks (or links) connected through
connecting devices. In other words, the Internet is an internetwork, a combination of
LANs and WANs.
• To better understand the role of the network layer (or the internetwork layer), we need to
think about the connecting devices (routers or switches) that connect the LANs and WANs.
• As the figure shows, the network layer is involved at the source host, destination host, and
all routers in the path (R2, R4, R5, and R7).
• At the source host (Alice), the network layer accepts a packet from a transport layer,
encapsulates the packet in a datagram, and delivers the packet to the data-link layer.
• At the destination host (Bob), the datagram is decapsulated, and the packet is extracted and
delivered to the corresponding transport layer.
• Although the source and destination hosts are involved in all five layers of the TCP/IP suite,
the routers use three layers if they are routing packets only; however, they may need the
transport and application layers for control purposes.
• A router in the path is normally shown with two data-link layers and two physical layers,
because it receives a packet from one network and delivers it to another network.

The Services of Network layer are:


• Packetizing
• Routing and Forwarding
• Other Services
• Error control
• Flow Control
• Congestion Control
• Quality of service
• Security
Packetizing
• The first duty of the network layer is definitely packetizing: encapsulating the payload
(data received from upper layer) in a network-layer packet at the source and decapsulating
the payload from the network-layer packet at the destination.
• The source is not allowed to change the content of the payload unless it is too large for
delivery and needs to be fragmented.
• The destination host receives the network-layer packet from its data-link layer,
decapsulates the packet, and delivers the payload to the corresponding upper-layer
protocol.
• If the packet is fragmented at the source or at routers along the path, the network layer is
responsible for waiting until all fragments arrive, reassembling them, and delivering them
to the upper-layer protocol.
• The routers in the path are not allowed to decapsulate the packets they received unless the
packets need to be fragmented.
• The routers are not allowed to change source and destination addresses either.
• They just inspect the addresses for the purpose of forwarding the packet to the next network
on the path.
• However, if a packet is fragmented, the header needs to be copied to all fragments and
some changes are needed.
Routing and Forwarding
• Routing : The network layer is responsible for routing the packet from its source to the
destination. A physical network is a combination of networks (LANs and WANs) and
routers that connect them.
• This means that there is more than one route from the source to the destination. The
network layer is responsible for finding the best one among these possible routes.
• The network layer needs to have some specific strategies for defining the best route.
• In the Internet today, this is done by running some routing protocols to help the routers
coordinate their knowledge about the neighborhood and to come up with consistent tables
to be used when a packet arrives.
• Forwarding: If routing is applying strategies and running some routing protocols to create
the decision-making tables for each router, forwarding can be defined as the action applied
by each router when a packet arrives at one of its interfaces.
• The decision-making table a router normally uses for applying this action is sometimes
called the forwarding table and sometimes the routing table.
• When a router receives a packet from one of its attached networks, it needs to forward the
packet to another attached network (in unicast routing) or to some attached networks (in
multicast routing).
• To make this decision, the router uses a piece of information in the packet header, which
can be the destination address or a label, to find the corresponding output interface number
in the forwarding table.
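
The lookup step described above can be sketched as a simple table lookup. This is only an illustration: the table contents, the key names, and the `forward` function are hypothetical, and a real router matches on address prefixes or labels rather than on plain strings.

```python
# Hypothetical forwarding table: a header key (destination address or label)
# mapped to an output interface number. All entries are made up.
FORWARDING_TABLE = {
    "B": 2,
    "C": 3,
    "D": 1,
}


def forward(header_key, table=FORWARDING_TABLE, default_interface=None):
    """Return the output interface for a packet, or the default if no entry matches."""
    return table.get(header_key, default_interface)


print(forward("B"))   # interface for destination "B"
print(forward("Z"))   # no entry, so the default (None) is returned
```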

Forwarding Process
Error Control
• Although error control can also be implemented in the network layer, the designers of the
network layer in the Internet ignored this issue for the data being carried by the network
layer.
• One reason for this decision is the fact that the packet in the network layer may be
fragmented at each router, which makes error checking at this layer inefficient.
• The designers of the network layer, however, have added a checksum field to the datagram
to control any corruption in the header, but not in the whole datagram.
• This checksum can detect changes or corruption in the header of the datagram; it cannot prevent them.
• We need to mention that although the network layer in the Internet does not directly provide
error control, the Internet uses an auxiliary protocol, ICMP, that provides some kind of
error control if the datagram is discarded or has some unknown information in the header.
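
As a rough sketch of how a header-only checksum works, the code below computes the 16-bit one's-complement Internet checksum, which is the method IPv4 uses for its header. The notes do not spell out the algorithm, so treat the function name and the sample header bytes as assumptions made for illustration.

```python
def internet_checksum(header: bytes) -> int:
    """16-bit one's-complement checksum over a header."""
    if len(header) % 2:
        header += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Illustrative 20-byte header with the checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(header)
print(hex(checksum))

# The receiver recomputes over the header including the checksum field;
# a result of 0 means no corruption was detected in the header.
received = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(internet_checksum(received) == 0)
```

Only the header is covered, so corruption in the payload goes unnoticed at this layer, which matches the design choice described above.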
Flow Control
• Flow control regulates the amount of data a source can send without overwhelming the
receiver. If the upper layer at the source computer produces data faster than the upper layer
at the destination computer can consume it, the receiver will be overwhelmed with data.
• The network layer in the Internet, however, does not directly provide any flow control.
• The datagrams are sent by the sender when they are ready, without any attention to the
readiness of the receiver.
• A few reasons for the lack of flow control in the design of the network layer can be
mentioned.
• First, since there is no error control in this layer, the job of the network layer at the receiver
is so simple that it may rarely be overwhelmed.
• Second, the upper layers that use the service of the network layer can implement buffers
to receive data from the network layer as they are ready and do not have to consume the
data as fast as it is received.
• Third, flow control is provided for most of the upper-layer protocols that use the services
of the network layer, so another level of flow control makes the network layer more
complicated and the whole system less efficient.
Congestion Control
• Another issue in a network-layer protocol is congestion control.
• Congestion in the network layer is a situation in which too many datagrams are present in
an area of the Internet.
• Congestion may occur if the number of datagrams sent by source computers is beyond the
capacity of the network or routers. In this situation, some routers may drop some of the
datagrams.
• However, as more datagrams are dropped, the situation may become worse because, due
to the error control mechanism at the upper layers, the sender may send duplicates of the
lost packets.
• If the congestion continues, sometimes a situation may reach a point where the system
collapses and no datagrams are delivered.
• Congestion control at the network layer is not implemented in the Internet.
Quality of Service
• As the Internet has allowed new applications such as multimedia communication (in
particular real-time communication of audio and video), the quality of service (QoS) of the
communication has become more and more important.
• The Internet has thrived by providing better quality of service to support these applications.
However, to keep the network layer untouched, these provisions are mostly implemented
in the upper layer.
Security
• Another issue related to communication at the network layer is security.
• Security was not a concern when the Internet was originally designed because it was used
by a small number of users at universities for research activities; other people had no access
to the Internet.
• The network layer was designed with no security provision. Today, however, security is a
big concern. To provide security for a connectionless network layer, we need to have
another virtual level that changes the connectionless service to a connection-oriented
service.

Packet Switching
• A router, in fact, is a switch that creates a connection between an input port and an output
port (or a set of output ports), just as an electrical switch connects the input to the output
to let electricity flow.
• Although in data communication switching techniques are divided into two broad
categories, circuit switching and packet switching, only packet switching is used at the
network layer because the unit of data at this layer is a packet.
• Circuit switching is mostly used at the physical layer; the electrical switch mentioned
earlier is a kind of circuit switch.
• At the network layer, a message from the upper layer is divided into manageable packets
and each packet is sent through the network.
• The source of the message sends the packets one by one; the destination of the message
receives the packets one by one.
• The destination waits for all packets belonging to the same message to arrive before
delivering the message to the upper layer.
• The connecting devices in a packet-switched network still need to decide how to route the
packets to the final destination.
• Today, a packet-switched network can use two different approaches to route the packets:
the datagram approach and the virtual circuit approach.
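
The split-and-wait behavior described above can be sketched as follows. The sequence number attached to each packet and the function names are assumptions made for illustration; the notes do not say how the destination recognizes that all packets have arrived.

```python
def packetize(message: bytes, payload_size: int):
    """Split a message into (sequence_number, payload) packets."""
    return [
        (seq, message[start:start + payload_size])
        for seq, start in enumerate(range(0, len(message), payload_size))
    ]


def reassemble(packets, expected_count):
    """Return the message once all packets have arrived, else None."""
    by_seq = dict(packets)  # duplicates collapse onto the same sequence number
    if len(by_seq) < expected_count:
        return None  # keep waiting for the missing packets
    return b"".join(by_seq[seq] for seq in sorted(by_seq))


message = b"network layer packets"
packets = packetize(message, 5)

# Packets may arrive out of order; the destination still rebuilds the message.
print(reassemble(list(reversed(packets)), len(packets)))
# With one packet missing, nothing is delivered to the upper layer yet.
print(reassemble(packets[:-1], len(packets)))
```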
Datagram Approach
• Connectionless Service: When the Internet started, to make it simple, the network layer was
designed to provide a connectionless service in which the network-layer protocol treats
each packet independently, with each packet having no relationship to any other packet.
• The idea was that the network layer is only responsible for delivery of packets from the
source to the destination.
• In this approach, the packets in a message may or may not travel the same path to their
destination.
• When the network layer provides a connectionless service, each packet traveling in the
Internet is an independent entity; there is no relationship between packets belonging to the
same message. The switches in this type of network are called routers.
• A packet belonging to a message may be followed by a packet belonging to the same
message or to a different message.
• A packet may be followed by a packet coming from the same or from a different
source.

A connectionless packet-switched network


• Each packet is routed based on the information contained in its header: the source and
destination addresses.
• The destination address defines where it should go; the source address defines where it
comes from.
• The router in this case routes the packet based only on the destination address.
Forwarding process in a router when used in a connectionless network

Virtual-Circuit Approach
• Connection-Oriented Service: In a connection-oriented service (also called the virtual-circuit
approach), there is a relationship between all packets belonging to a message.
• Before all datagrams in a message can be sent, a virtual connection should be set up to
define the path for the datagrams.
• After connection setup, the datagrams can all follow the same path. In this type of service,
not only must the packet contain the source and destination addresses, it must also contain
a flow label, a virtual circuit identifier that defines the virtual path the packet should follow.
Shortly, we will show how this flow label is determined, but for the moment, we assume
that the packet carries this label.
• Although it looks as though the use of the label may make the source and destination
addresses unnecessary during the data transfer phase, parts of the Internet at the network
layer still keep these addresses.
• One reason is that part of the packet path may still be using the connectionless service.
• Another reason is that the protocol at the network layer is designed with these addresses,
and it may take a while before they can be changed. The concept of connection-oriented
service is shown below.

A virtual-circuit packet-switched network


• Each packet is forwarded based on the label in the packet.
• To follow the idea of connection-oriented design to be used in the Internet, we assume that
the packet has a label when it reaches the router.
• In this case, the forwarding decision is based on the value of the label, or virtual circuit
identifier, as it is sometimes called.
• To create a connection-oriented service, a three-phase process is used: setup, data transfer,
and teardown.
• In the setup phase, the source and destination addresses of the sender and receiver are used
to make table entries for the connection-oriented service.
• In the teardown phase, the source and destination inform the router to delete the
corresponding entries.
• Data transfer occurs between these two phases, that is, after setup and before teardown.

Setup Phase
• In the setup phase, a router creates an entry for a virtual circuit.
• For example, suppose source A needs to create a virtual circuit to destination B.
• Two auxiliary packets need to be exchanged between the sender and the receiver: the request
packet and the acknowledgment packet.

Forwarding process in a router when used in a virtual-circuit network


Request packet
• A request packet is sent from the source to the destination. This auxiliary packet carries the
source and destination addresses. The process is shown below.
Sending request packet in a virtual-circuit network
1. Source A sends a request packet to router R1.
2. Router R1 receives the request packet. It knows that a packet going from A to B goes out through
port 3. How the router has obtained this information is a point covered later. For the moment,
assume that it knows the output port. The router creates an entry in its table for this virtual circuit,
but it is only able to fill three of the four columns. The router assigns the incoming port (1) and
chooses an available incoming label (14) and the outgoing port (3). It does not yet know the
outgoing label, which will be found during the acknowledgment step. The router then forwards the
packet through port 3 to router R3.
3. Router R3 receives the setup request packet. The same events happen here as at router R1; three
columns of the table are completed: in this case, incoming port (1), incoming label (66), and
outgoing port (3).
4. Router R4 receives the setup request packet. Again, three columns are completed: incoming port
(1), incoming label (22), and outgoing port (4).
5. Destination B receives the setup packet, and if it is ready to receive packets from A, it assigns a
label to the incoming packets that come from A, in this case 77. This label lets the destination
know that the packets come from A, and not from other sources.
Acknowledgment Packet
• A special packet, called the acknowledgment packet, completes the entries in the switching
tables. The process is shown below.
Sending acknowledgments in a virtual-circuit network
1. The destination sends an acknowledgment to router R4. The acknowledgment carries the global
source and destination addresses so the router knows which entry in the table is to be completed.
The packet also carries label 77, chosen by the destination as the incoming label for packets from
A. Router R4 uses this label to complete the outgoing label column for this entry. Note that 77 is
the incoming label for destination B, but the outgoing label for router R4.
2. Router R4 sends an acknowledgment to router R3 that contains its incoming label in the table,
chosen in the setup phase. Router R3 uses this as the outgoing label in the table.
3. Router R3 sends an acknowledgment to router R1 that contains its incoming label in the table,
chosen in the setup phase. Router R1 uses this as the outgoing label in the table.
4. Finally router R1 sends an acknowledgment to source A that contains its incoming label in the
table, chosen in the setup phase.
5. The source uses this as the outgoing label for the data packets to be sent to destination B.
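
The request and acknowledgment steps above can be traced with a small simulation. The port and label values are the ones used in the walkthrough; the dictionary layout and the function name `setup_virtual_circuit` are assumptions chosen for illustration, not a description of any real router software.

```python
# Router path and the three columns each router can fill when the request
# packet passes through (values taken from the walkthrough above).
PATH = ["R1", "R3", "R4"]
REQUEST_ENTRIES = {
    "R1": {"in_port": 1, "in_label": 14, "out_port": 3},
    "R3": {"in_port": 1, "in_label": 66, "out_port": 3},
    "R4": {"in_port": 1, "in_label": 22, "out_port": 4},
}
DEST_LABEL = 77  # label chosen by destination B


def setup_virtual_circuit(path, request_entries, dest_label):
    """Fill the tables in two passes: request (A to B), then acknowledgment (B to A)."""
    # Request pass: the outgoing label is still unknown at every router.
    tables = {router: dict(request_entries[router], out_label=None) for router in path}

    # Acknowledgment pass: each node reports its incoming label upstream,
    # where it becomes the upstream router's outgoing label.
    label_from_downstream = dest_label
    for router in reversed(path):
        tables[router]["out_label"] = label_from_downstream
        label_from_downstream = tables[router]["in_label"]

    # The last label reported (by the first router) is the one the source uses.
    return tables, label_from_downstream


tables, source_label = setup_virtual_circuit(PATH, REQUEST_ENTRIES, DEST_LABEL)
for router in PATH:
    print(router, tables[router])
print("source uses label", source_label)
```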

Data-Transfer Phase
• The second phase is called the data-transfer phase.
• After all routers have created their forwarding table for a specific virtual circuit, then the
network-layer packets belonging to one message can be sent one after another.
• The figure shows the flow of a single packet, but the process is the same for 1, 2, or 100 packets.
• The source computer uses the label 14, which it has received from router R1 in the setup
phase. Router R1 forwards the packet to router R3, but changes the label to 66.
• Router R3 forwards the packet to router R4, but changes the label to 22.
• Finally, router R4 delivers the packet to its final destination with the label 77.
• All the packets in the message follow the same sequence of labels, and the packets arrive
in order at the destination.
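
The label swapping in the data-transfer phase can be sketched as a chain of table lookups. The completed table entries below use the labels from the example (14, 66, 22, 77); the fixed router order and the assumption that every router receives on port 1 are simplifications made for illustration.

```python
# Completed tables after setup: (in_port, in_label) -> (out_port, out_label).
TABLES = {
    "R1": {(1, 14): (3, 66)},
    "R3": {(1, 66): (3, 22)},
    "R4": {(1, 22): (4, 77)},
}
PATH = ["R1", "R3", "R4"]  # simplification: the path is fixed in advance
IN_PORT = 1                # every router on this path receives on port 1


def send(label):
    """Return the sequence of labels a packet carries from source to destination."""
    trace = [label]
    for router in PATH:
        _out_port, label = TABLES[router][(IN_PORT, label)]  # swap the label
        trace.append(label)
    return trace


print(send(14))
```

Every packet of the message goes through the same lookups, which is why all packets carry the same sequence of labels and arrive in order.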

Flow of one packet in an established virtual circuit


Teardown Phase
• In the teardown phase, source A, after sending all packets to B, sends a special packet called
a teardown packet.
• Destination B responds with a confirmation packet. All routers delete the corresponding
entries from their tables.

IPv4 Addresses
• The identifier used in the IP layer of the TCP/IP protocol suite to identify the connection
of each device to the Internet is called the Internet address or IP address.
• An IPv4 address is a 32-bit address that uniquely and universally defines the connection of
a host or a router to the Internet.
• The IP address is the address of the connection, not the host or the router, because if the
device is moved to another network, the IP address may be changed.
• IPv4 addresses are unique in the sense that each address defines one, and only one,
connection to the Internet.
• If a device has two connections to the Internet, via two networks, it has two IPv4 addresses.
• IPv4 addresses are universal in the sense that the addressing system must be accepted by
any host that wants to be connected to the Internet.
Address Space
• A protocol like IPv4 that defines addresses has an address space.
• An address space is the total number of addresses used by the protocol.
• If a protocol uses b bits to define an address, the address space is 2^b because each bit can
have two different values (0 or 1). IPv4 uses 32-bit addresses, which means that the address
space is 2^32 or 4,294,967,296 (more than four billion).
• If there were no restrictions, more than 4 billion devices could be connected to the Internet.
IPv4 Notation
• There are three common notations to show an IPv4 address:
1. binary notation (base 2)
2. dotted-decimal notation (base 256)
3. hexadecimal notation (base 16).
• In binary notation, an IPv4 address is displayed as 32 bits.
• To make the address more readable, one or more spaces are usually inserted between each
octet (8 bits). Each octet is often referred to as a byte.
• To make the IPv4 address more compact and easier to read, it is usually written in decimal
form with a decimal point (dot) separating the bytes.
• This format is referred to as dotted-decimal notation. Note that because each byte (octet)
is only 8 bits, each number in the dotted-decimal notation is between 0 and 255.
• We sometimes see an IPv4 address in hexadecimal notation. Each hexadecimal digit is
equivalent to four bits.
• This means that a 32-bit address has 8 hexadecimal digits. This notation is often used in
network programming.
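
The three notations can be converted into one another mechanically. The sketch below starts from dotted-decimal notation; the sample address and the function names are chosen only for illustration.

```python
def parse_dotted(dotted: str):
    """Split a dotted-decimal address into four integers between 0 and 255."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {dotted}")
    return octets


def to_binary(dotted: str) -> str:
    """Binary notation: 32 bits, with a space between octets for readability."""
    return " ".join(f"{octet:08b}" for octet in parse_dotted(dotted))


def to_hex(dotted: str) -> str:
    """Hexadecimal notation: 8 hexadecimal digits, 4 bits per digit."""
    return "0x" + "".join(f"{octet:02X}" for octet in parse_dotted(dotted))


address = "128.11.3.31"  # illustrative address
print(to_binary(address))
print(to_hex(address))
```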

Three different notations in IPv4 addressing


Hierarchy in Addressing
• In any communication network that involves delivery, such as a telephone network or a
postal network, the addressing system is hierarchical.
• In a postal network, the postal address (mailing address) includes the country, state, city,
street, house number, and the name of the mail recipient. Similarly, a telephone number is
divided into the country code, area code, local exchange, and the connection.
• A 32-bit IPv4 address is also hierarchical, but divided only into two parts.
• The first part of the address, called the prefix, defines the network; the second part of the
address, called the suffix, defines the node (connection of a device to the Internet).
• The diagram below shows the prefix and suffix of a 32-bit IPv4 address.
• The prefix length is n bits and the suffix length is (32 − n) bits.
• A prefix can be fixed length or variable length. The network identifier in the IPv4 was first
designed as a fixed-length prefix.
• This scheme, which is now obsolete, is referred to as classful addressing.
• The new scheme, which is referred to as classless addressing, uses a variable-length
network prefix.
• First, we briefly discuss classful addressing; then we concentrate on classless addressing.

Hierarchy in addressing

Classful Addressing
• When the Internet started, an IPv4 address was designed with a fixed-length prefix, but to
accommodate both small and large networks, three fixed-length prefixes were designed
instead of one (n = 8, n = 16, and n = 24).
• The whole address space was divided into five classes (class A, B, C, D, and E), as shown
in the diagram below.
• This scheme is referred to as classful addressing.
• In class A, the network length is 8 bits, but since the first bit, which is 0, defines the class,
we can have only seven bits as the network identifier. This means there are only 2^7 = 128
networks in the world that can have a class A address.
• In class B, the network length is 16 bits, but since the first two bits, which are 10 in binary,
define the class, we can have only 14 bits as the network identifier. This means there are
only 2^14 = 16,384 networks in the world that can have a class B address.
• All addresses that start with 110 in binary belong to class C. In class C, the network length
is 24 bits, but since three bits define the class, we can have only 21 bits as the network
identifier. This means there are 2^21 = 2,097,152 networks in the world that can have a
class C address.
• Class D is not divided into prefix and suffix. It is used for multicast addresses.
• All addresses that start with 1111 in binary belong to class E. As in class D, class E is not
divided into prefix and suffix; it is kept in reserve.
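
Because the leading bits fix the class, the class and its prefix length can be read from the first byte alone. The following is a minimal sketch; the function names are hypothetical, and the class D range (leading bits 1110) is stated here as the standard definition rather than taken from the text above.

```python
def address_class(dotted: str) -> str:
    """Return the class (A to E) of an IPv4 address from its first byte."""
    first = int(dotted.split(".")[0])
    if first < 128:   # leading bit  0
        return "A"
    if first < 192:   # leading bits 10
        return "B"
    if first < 224:   # leading bits 110
        return "C"
    if first < 240:   # leading bits 1110
        return "D"
    return "E"        # leading bits 1111


# Classes D and E are not divided into prefix and suffix.
PREFIX_LENGTH = {"A": 8, "B": 16, "C": 24, "D": None, "E": None}

for sample in ("10.1.2.3", "172.16.0.1", "200.5.6.7", "224.0.0.5", "250.1.1.1"):
    cls = address_class(sample)
    print(sample, cls, PREFIX_LENGTH[cls])
```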

Occupation of the address space in classful addressing

Address Depletion
• The reason that classful addressing has become obsolete is address depletion.
• Since the addresses were not distributed properly, the Internet was faced with the problem
of the addresses being rapidly used up, resulting in no more addresses available for
organizations and individuals that needed to be connected to the Internet.
• To understand the problem, let us think about class A.
• This class can be assigned to only 128 organizations in the world, but each organization
needs to have a single network (seen by the rest of the world) with 16,777,216 nodes
(computers in this single network).
• Since there may be only a few organizations that are this large, most of the addresses in
this class were wasted (unused).
• Class B addresses were designed for midsize organizations, but many of the addresses in
this class also remained unused.
• Class C addresses have a completely different flaw in design. The number of addresses that
can be used in each network (256) was so small that most companies were not comfortable
using a block in this address class.
• Class E addresses were almost never used, wasting the whole class.
Subnetting and Supernetting
• To alleviate address depletion, two strategies were proposed and, to some extent,
implemented: subnetting and supernetting.
• In subnetting, a class A or class B block is divided into several subnets. Each subnet has a
larger prefix length than the original network.
• For example, if a network in class A is divided into four subnets, each subnet has a prefix
length of n_sub = 10. At the same time, if not all of the addresses in a network are used,
subnetting allows the addresses to be divided among several organizations.
• This idea did not work because most large organizations were not happy about dividing the
block and giving some of the unused addresses to smaller organizations.
• While subnetting was devised to divide a large block into smaller ones, supernetting was
devised to combine several class C blocks into a larger block to be attractive to
organizations that need more than the 256 addresses available in a class C block.
• This idea did not work either because it makes the routing of packets more difficult.

Advantage of Classful Addressing


• Although classful addressing had several problems and became obsolete, it had one
advantage: Given an address, we can easily find the class of the address and, since the
prefix length for each class is fixed, we can find the prefix length immediately.
• In other words, the prefix length in classful addressing is inherent in the address; no extra
information is needed to extract the prefix and the suffix.

Classless Addressing
• In classless addressing, variable-length blocks are used that belong to no classes.
• We can have a block of 1 address, 2 addresses, 4 addresses, 128 addresses, and so on. In
classless addressing, the whole address space is divided into variable length blocks.
• The prefix in an address defines the block (network); the suffix defines the node (device).
Theoretically, we can have a block of 2^0, 2^1, 2^2, . . . , 2^32 addresses.
• The number of addresses in a block needs to be a power of 2. An organization can be
granted one block of addresses.
• The figure below shows the division of the whole address space into nonoverlapping blocks.
Variable-length blocks in classless addressing
• Unlike classful addressing, the prefix length in classless addressing is variable.
• We can have a prefix length that ranges from 0 to 32. The size of the network is inversely
proportional to the length of the prefix.
• A small prefix means a larger network; a large prefix means a smaller network.
• We need to emphasize that the idea of classless addressing can be easily applied to classful
addressing.
• An address in class A can be thought of as a classless address in which the prefix length is
8.
• An address in class B can be thought of as a classless address in which the prefix is 16, and
so on.
• In other words, classful addressing is a special case of classless addressing.

Prefix Length: Slash Notation


• The first question that we need to answer in classless addressing is how to find the prefix
length if an address is given.
• Since the prefix length is not inherent in the address, we need to separately give the length
of the prefix. In this case, the prefix length, n, is added to the address, separated by a slash.
• The notation is informally referred to as slash notation and formally as classless
interdomain routing or CIDR (pronounced cider) strategy.
• An address in classless addressing can then be represented as below

Slash notation (CIDR)

Extracting Information from an Address


Given any address in the block and the prefix length n, we can extract three pieces of
information, as shown below:
Information extraction in classless addressing
1. The number of addresses in the block is found as N = 2^(32 − n).
2. To find the first address, we keep the n leftmost bits and set the (32 − n) rightmost bits all to 0s.
3. To find the last address, we keep the n leftmost bits and set the (32 − n) rightmost bits all to 1s.
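
The three rules above translate directly into integer arithmetic. In this sketch the address 167.199.170.82/27 is an illustrative choice, and the helper names are hypothetical.

```python
def to_int(dotted: str) -> int:
    """Convert dotted-decimal notation to a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(value: int) -> str:
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def block_info(address: str, n: int):
    """Return (number of addresses, first address, last address) for address/n."""
    suffix_bits = 32 - n
    count = 2 ** suffix_bits                                # N = 2^(32 - n)
    first = (to_int(address) >> suffix_bits) << suffix_bits  # rightmost bits set to 0s
    last = first | (count - 1)                              # rightmost bits set to 1s
    return count, to_dotted(first), to_dotted(last)


print(block_info("167.199.170.82", 27))
```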
Example:

Address Mask
• Another way to find the first and last addresses in the block is to use the address mask.
• The address mask is a 32-bit number in which the n leftmost bits are set to 1s and the rest
of the bits (32 − n) are set to 0s.
• A computer can easily find the address mask because it is the complement of (2^(32 − n) − 1).
• The reason for defining a mask in this way is that it can be used by a computer program to
extract the information in a block, using the three bit-wise operations NOT, AND, and OR.
1. The number of addresses in the block N = NOT (mask) + 1.
2. The first address in the block = (Any address in the block) AND (mask).
3. The last address in the block = (Any address in the block) OR [NOT (mask)].
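
The mask-based rules can be written with bitwise operators. This is a sketch under the assumption that addresses are held as 32-bit unsigned integers; the sample address and the helper names are illustrative.

```python
FULL = 0xFFFFFFFF  # 32 bits, all 1s


def to_int(dotted: str) -> int:
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(value: int) -> str:
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def mask_from_prefix(n: int) -> int:
    """n leftmost bits set to 1: the complement of (2^(32 - n) - 1)."""
    return FULL ^ ((1 << (32 - n)) - 1)


def block_info_with_mask(address: str, mask: int):
    addr = to_int(address)
    not_mask = ~mask & FULL          # NOT, kept within 32 bits
    count = not_mask + 1             # N = NOT(mask) + 1
    first = addr & mask              # first = address AND mask
    last = addr | not_mask           # last  = address OR NOT(mask)
    return count, to_dotted(first), to_dotted(last)


mask = mask_from_prefix(27)
print(to_dotted(mask))
print(block_info_with_mask("167.199.170.82", mask))
```

The explicit `& FULL` is needed because Python integers are unbounded, so a plain `~mask` would be negative.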
Example:
Previous example repeated using the mask. The mask in dotted-decimal notation is
255.255.255.224. The AND, OR, and NOT operations can be applied to individual bytes
using calculators and applets at the book website.

Network Address
• The first address, the network address, is important because it is used in routing a packet
to its destination network.
• When a packet arrives at the router from any source host, the router needs to know to which
network the packet should be sent: from which interface the packet should be sent out.
• After the network address has been found, the router consults its forwarding table to find
the corresponding interface from which the packet should be sent out.
• The network address is actually the identifier of the network; each network is identified by
its network address.

Network address
Block Allocation
• The ultimate responsibility of block allocation is given to a global authority called the
Internet Corporation for Assigned Names and Numbers (ICANN).
• However, ICANN does not normally allocate addresses to individual Internet users. It
assigns a large block of addresses to an ISP.
• For the proper operation of the CIDR, two restrictions need to be applied to the allocated
block.
1. The number of requested addresses, N, needs to be a power of 2. The reason is that
N = 2^(32 − n) or n = 32 − log2(N). If N is not a power of 2, we cannot have an integer value for n.
2. The requested block needs to be allocated where there is an adequate number of
contiguous addresses available in the address space. The first address needs to be the prefix
followed by (32 − n) number of 0s. The decimal value of the first address is then
first address = (prefix in decimal) × 2^(32 − n) = (prefix in decimal) × N.
Example:
An ISP has requested a block of 1000 addresses. Since 1000 is not a power of 2, 1024
addresses are granted. The prefix length is calculated as n = 32 − log2(1024) = 22. An
available block, 18.14.12.0/22, is granted to the ISP.
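
The arithmetic in this example can be checked with a few lines of code. The function names are hypothetical; the sketch only verifies the two restrictions (power-of-2 size, first address a multiple of N) and does not model how a registry actually searches for free space.

```python
def granted_block_size(requested: int) -> int:
    """Smallest power of 2 that is at least the requested number of addresses."""
    return 1 << max(requested - 1, 0).bit_length()


def prefix_length(block_size: int) -> int:
    """n = 32 - log2(N) for a block size N that is a power of 2."""
    return 32 - (block_size.bit_length() - 1)


def is_valid_first_address(dotted: str, n: int) -> bool:
    """The first address must have its (32 - n) rightmost bits all 0."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    value = (a << 24) | (b << 16) | (c << 8) | d
    return value % (1 << (32 - n)) == 0


size = granted_block_size(1000)
n = prefix_length(size)
print(size, n, is_valid_first_address("18.14.12.0", n))
```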

Subnetting
• More levels of hierarchy can be created using subnetting.
• An organization (or an ISP) that is granted a range of addresses may divide the range into
several subranges and assign each subrange to a subnetwork (or subnet).
Designing Subnets
• We assume the total number of addresses granted to the organization is N, the prefix length
is n, the assigned number of addresses to each subnetwork is Nsub, and the prefix length for
each subnetwork is nsub.
• Then the following steps need to be carefully followed to guarantee the proper operation
of the subnetworks.
1. The number of addresses in each subnetwork should be a power of 2.
2. The prefix length for each subnetwork should be found using the following formula:
$n_{sub} = 32 - \log_2 N_{sub}$
3. The starting address in each subnetwork should be divisible by the number of addresses
in that subnetwork.
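
The three steps can be sketched as a small allocator. The block 14.24.74.0/24 and the subnet sizes (120, 60, and 10 addresses) are hypothetical values chosen only for illustration, and design_subnets is not a standard function. Assigning the largest subnetwork first is one simple way to make sure every starting address satisfies step 3.

```python
import ipaddress


def design_subnets(block: str, requests: list[int]) -> list[ipaddress.IPv4Network]:
    """Carve subnets out of `block`, largest request first."""
    network = ipaddress.ip_network(block)
    next_addr = int(network.network_address)
    subnets = []
    for wanted in sorted(requests, reverse=True):
        n_sub_addresses = 1 << (wanted - 1).bit_length()      # step 1: power of 2
        prefix_sub = 32 - (n_sub_addresses.bit_length() - 1)  # step 2: n_sub
        if next_addr % n_sub_addresses != 0:                  # step 3: alignment
            raise ValueError("starting address is not divisible by subnet size")
        subnet = ipaddress.ip_network((next_addr, prefix_sub))
        if not subnet.subnet_of(network):
            raise ValueError("granted block is too small")
        subnets.append(subnet)
        next_addr += n_sub_addresses
    return subnets


for subnet in design_subnets("14.24.74.0/24", [10, 60, 120]):
    print(subnet, subnet.num_addresses)
# 14.24.74.0/25 128
# 14.24.74.128/26 64
# 14.24.74.192/28 16
```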
Address Aggregation
• One of the advantages of the CIDR strategy is address aggregation (sometimes called
address summarization or route summarization).
• When blocks of addresses are combined to create a larger block, routing can be done based
on the prefix of the larger block.
• ICANN assigns a large block of addresses to an ISP.
• Each ISP in turn divides its assigned block into smaller subblocks and grants the subblocks
to its customers.
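
As a minimal illustration of aggregation, the sketch below combines four customer subblocks into the single larger block that the rest of the Internet needs to know about. The addresses are hypothetical; collapse_addresses is part of Python's standard ipaddress module.

```python
import ipaddress

# Four hypothetical /26 subblocks that an ISP granted to its customers.
customers = [
    ipaddress.ip_network(f"160.70.14.{start}/26") for start in (0, 64, 128, 192)
]

# Routers outside the ISP can use one entry for the combined block.
aggregate = list(ipaddress.collapse_addresses(customers))
print(aggregate)  # [IPv4Network('160.70.14.0/24')]
```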
Special Addresses
• Five kinds of addresses are reserved for special purposes: the this-host address, the limited-
broadcast address, the loopback address, private addresses, and multicast addresses.
This-host Address
• The only address in the block 0.0.0.0/32 is called the this-host address.
• It is used whenever a host needs to send an IP datagram but it does not know its own address
to use as the source address.
Limited-broadcast Address
• The only address in the block 255.255.255.255/32 is called the limited-broadcast address.
• It is used whenever a router or a host needs to send a datagram to all devices in a network.
Loopback Address
• The block 127.0.0.0/8 is called the loopback address.
• A packet with one of the addresses in this block as the destination address never leaves the
host.
• We can test the programs using the same host to see if they work before running them on
different computers.
Private Addresses
• Four blocks are assigned as private addresses: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16,
and 169.254.0.0/16.
Multicast Addresses
• The block 224.0.0.0/4 is reserved for multicast addresses.
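
The blocks listed above can be collected into a small lookup, sketched below with Python's standard ipaddress module. The function name classify and the label "ordinary" are assumptions made for this example.

```python
import ipaddress

# The special blocks exactly as listed in these notes.
SPECIAL_BLOCKS = [
    ("this-host", "0.0.0.0/32"),
    ("limited-broadcast", "255.255.255.255/32"),
    ("loopback", "127.0.0.0/8"),
    ("private", "10.0.0.0/8"),
    ("private", "172.16.0.0/12"),
    ("private", "192.168.0.0/16"),
    ("private", "169.254.0.0/16"),
    ("multicast", "224.0.0.0/4"),
]


def classify(address: str) -> str:
    """Return the special-purpose label for an IPv4 address, if any."""
    ip = ipaddress.ip_address(address)
    for label, block in SPECIAL_BLOCKS:
        if ip in ipaddress.ip_network(block):
            return label
    return "ordinary"


for addr in ("0.0.0.0", "127.0.0.1", "172.18.3.1", "239.1.2.3", "18.14.12.0"):
    print(addr, classify(addr))
```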

Dynamic Host Configuration Protocol (DHCP)


• Address assignment in an organization can be done automatically using the Dynamic Host
Configuration Protocol (DHCP).
• DHCP is often called a plug-and-play protocol.
• A network manager can configure DHCP to assign permanent IP addresses to hosts and
routers.
• DHCP can also be configured to provide temporary, on demand, IP addresses to hosts.
• Four pieces of information are normally needed: the computer address, the prefix, the
address of a router, and the IP address of a name server. DHCP can be used to provide these
pieces of information to the host.
DHCP Message Format
• DHCP is a client-server protocol in which the client sends a request message and the server
returns a response message.
• The figure below shows the DHCP message format.

DHCP message format


• The 64-byte option field has a dual purpose.
• It can carry either additional information or some specific vendor information.
• The server uses a number, called a magic cookie, in the format of an IP address with the
value of 99.130.83.99.
• When the client finishes reading the message, it looks for this magic cookie.
• If present, the next 60 bytes are options.
• An option is composed of three fields: a 1-byte tag field, a 1-byte length field, and a
variable-length value field.
• There are several tag fields that are mostly used by vendors.
• If the tag field is 53, the value field defines one of the 8 message types shown below.
Option format
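
The tag-length-value layout described above can be sketched as a small parser. The pad tag (0), the end tag (255), the lease-time tag (51), and the message-type names come from the DHCP options specification (RFC 2132) rather than from these notes, which spell type 6 as DHCPNACK; the function name parse_options is hypothetical.

```python
import ipaddress

# The magic cookie 99.130.83.99 as four raw bytes.
MAGIC_COOKIE = ipaddress.IPv4Address("99.130.83.99").packed

# Values of option 53 (message type), per RFC 2132.
MESSAGE_TYPES = {
    1: "DHCPDISCOVER", 2: "DHCPOFFER", 3: "DHCPREQUEST", 4: "DHCPDECLINE",
    5: "DHCPACK", 6: "DHCPNAK", 7: "DHCPRELEASE", 8: "DHCPINFORM",
}


def parse_options(field: bytes) -> dict[int, bytes]:
    """Return {tag: value} if the field starts with the magic cookie."""
    if field[:4] != MAGIC_COOKIE:
        return {}
    options = {}
    i = 4
    while i < len(field):
        tag = field[i]
        if tag == 0:          # pad option: a single byte with no length
            i += 1
            continue
        if tag == 255 or i + 1 >= len(field):   # end option or truncated field
            break
        length = field[i + 1]
        options[tag] = field[i + 2 : i + 2 + length]
        i += 2 + length
    return options


# A hand-built sample: message type = 2 (offer), lease time = 3600 seconds.
sample = MAGIC_COOKIE + bytes([53, 1, 2]) + bytes([51, 4, 0, 0, 14, 16]) + bytes([255])
opts = parse_options(sample)
print(MESSAGE_TYPES[opts[53][0]])        # DHCPOFFER
print(int.from_bytes(opts[51], "big"))   # 3600
```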

DHCP Operation
1. The joining host creates a DHCPDISCOVER message in which only the transaction-ID field is
set to a random number. No other field can be set because the host has no knowledge with which
to do so. This message is encapsulated in a UDP user datagram with the source port set to 68 and
the destination port set to 67. The user datagram is encapsulated in an IP datagram with the source
address set to 0.0.0.0 (“this host”) and the destination address set to 255.255.255.255 (broadcast
address). The reason is that the joining host knows neither its own address nor the server address.
2. The DHCP server or servers (if more than one) respond with a DHCPOFFER message in which
the your-address field defines the offered IP address for the joining host and the server-address
field includes the IP address of the server. The message also includes the lease time for which the
host can keep the IP address. This message is encapsulated in a user datagram with the same port
numbers, but in the reverse order. The user datagram in turn is encapsulated in a datagram with the
server address as the source IP address, but the destination address is a broadcast address, which
allows other DHCP servers to receive the offer and give a better offer if they can.
3. The joining host receives one or more offers and selects the best of them. The joining host then
sends a DHCPREQUEST message to the server that has given the best offer. The fields with known
values are set. The message is encapsulated in a user datagram with the same port numbers as the first
message. The user datagram is encapsulated in an IP datagram with the source address set to the
new client address, but the destination address still is set to the broadcast address to let the other
servers know that their offer was not accepted.
4. Finally, the selected server responds with a DHCPACK message to the client if the offered IP
address is valid. If the server cannot keep its offer (for example, if the address is offered to another
host in between), the server sends a DHCPNACK message and the client needs to repeat the
process. This message is also broadcast to let other servers know that the request is accepted or
rejected.
Two Well-Known Ports
• The DHCP uses two well-known ports (68 and 67) instead of one well-known and one
ephemeral.
• The reason for choosing the well-known port 68 instead of an ephemeral port for the client
is that the response from the server to the client is broadcast.
• Now assume that a DHCP client and a DAYTIME client, for example, are both waiting to
receive a response from their corresponding server and both have accidentally used the
same temporary port number (56017, for example).
• Both hosts receive the response message from the DHCP server and deliver the message to
their clients. The DHCP client processes the message; the DAYTIME client is totally
confused with a strange message received.
• Using a well-known port number prevents this problem from happening. The response
message from the DHCP server is not delivered to the DAYTIME client, which is running
on the port number 56017, not 68.
• The temporary port numbers are selected from a different range than the well-known port
numbers.
Using FTP
• The server does not send all of the information that a client may need for joining the
network.
• In the DHCPACK message, the server defines the pathname of a file in which the client
can find complete information such as the address of the DNS server.
• The client can then use a file transfer protocol to obtain the rest of the needed information.

Error Control
• DHCP uses the service of UDP, which is not reliable.
• To provide error control, DHCP uses two strategies.
• First, DHCP requires that UDP use the checksum; recall that the use of the checksum in UDP is
otherwise optional.
• Second, the DHCP client uses timers and a retransmission policy if it does not receive the
DHCP reply to a request.
• However, to prevent a traffic jam when several hosts need to retransmit a request (for
example, after a power failure), DHCP forces the client to use a random number to set its
timers.
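
The randomized timers can be illustrated with a short sketch. The specific numbers (a 4-second first delay that doubles up to 64 seconds, with a random offset of up to 1 second either way) are assumptions modeled on the guidance in RFC 2131 and are not stated in these notes; retransmission_delays is a hypothetical helper.

```python
import random


def retransmission_delays(attempts, base=4.0, cap=64.0, jitter=1.0, rng=None):
    """Return the waiting time before each retransmission, in seconds."""
    rng = rng or random.Random()
    delays = []
    nominal = base
    for _ in range(attempts):
        # The random offset keeps clients that restarted together from
        # retransmitting at the same instant.
        delays.append(nominal + rng.uniform(-jitter, jitter))
        nominal = min(nominal * 2, cap)
    return delays


# Two clients that start at the same moment (for example, after a power
# failure) end up with different schedules.
client_1 = retransmission_delays(5, rng=random.Random(1))
client_2 = retransmission_delays(5, rng=random.Random(2))
print([round(d, 2) for d in client_1])
print([round(d, 2) for d in client_2])
```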

Transition States
• To provide dynamic address allocation, the DHCP client acts as a state machine that
performs transitions from one state to another depending on the messages it receives or
sends.
• The diagram below shows the transition diagram with the main states.

FSM for the DHCP client


• When the DHCP client first starts, it is in the INIT state (initializing state).
• The client broadcasts a discover message.
• When it receives an offer, the client goes to the SELECTING state.
• While it is there, it may receive more offers.
• After it selects an offer, it sends a request message and goes to the REQUESTING state.
• If an ACK arrives while the client is in this state, it goes to the BOUND state and uses the
IP address.
• When the lease is 50 percent expired, the client tries to renew it by moving to the
RENEWING state.
• If the server renews the lease, the client moves to the BOUND state again.
• If the lease is not renewed and the lease time is 75 percent expired, the client moves to the
REBINDING state.
• If the server agrees with the lease (ACK message arrives), the client moves to the BOUND
state and continues using the IP address; otherwise, the client moves to the INIT state and
requests another IP address.
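
The transitions described above can be written as a lookup table. This is only a sketch of the main states as the bullets describe them; the event names (offer, select, ack, lease_50, lease_75, nak_or_expired) are made up for the example.

```python
# (current state, event) -> next state, following the bullets above.
TRANSITIONS = {
    ("INIT", "offer"): "SELECTING",
    ("SELECTING", "offer"): "SELECTING",      # more offers may arrive
    ("SELECTING", "select"): "REQUESTING",    # client sends a request
    ("REQUESTING", "ack"): "BOUND",
    ("BOUND", "lease_50"): "RENEWING",        # lease 50 percent expired
    ("RENEWING", "ack"): "BOUND",
    ("RENEWING", "lease_75"): "REBINDING",    # lease 75 percent expired
    ("REBINDING", "ack"): "BOUND",
    ("REBINDING", "nak_or_expired"): "INIT",
}


def run(events, state="INIT"):
    """Return the list of states visited while processing the events."""
    trace = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not handled in state {state}")
        state = TRANSITIONS[key]
        trace.append(state)
    return trace


print(run(["offer", "offer", "select", "ack"]))
# ['INIT', 'SELECTING', 'SELECTING', 'REQUESTING', 'BOUND']
```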
Network Address Translation (NAT)
• Assume that in a small business with 20 computers the maximum number of computers
that access the Internet simultaneously is only 4.
• This small business can use the TCP/IP protocol for both internal and universal
communication. The business can use 20 (or 25) addresses from the private block addresses
for internal communication; five addresses for universal communication can be assigned
by the ISP.
• NAT is a technology that can provide the mapping between the private and universal
addresses, and at the same time support virtual private networks.
• The technology allows a site to use a set of private addresses for internal communication
and a set of global Internet addresses (at least one) for communication with the rest of the
world.
• The site must have only one connection to the global Internet through a NAT-capable router
that runs NAT software.
Address Translation
• All of the outgoing packets go through the NAT router, which replaces the source address
in the packet with the global NAT address.
• All incoming packets also pass through the NAT router, which replaces the destination
address in the packet (the NAT router global address) with the appropriate private address.
Translation Table
• How does the NAT router know the destination address for a packet coming from the
Internet? There may be tens or hundreds of private IP addresses, each belonging to one
specific host. The problem is solved if the NAT router has a translation table.

Address translation
Using One IP Address
• A translation table has only two columns, the private address and the external address
(destination address of the packet).
• When the router translates the source address of the outgoing packet, it also makes note of
the destination address.
• When the response comes back from the destination, the router uses the source address of
the packet to find the private address of the packet.
• In this strategy, communication must always be initiated by the private network, because the
table entry is created only when an outgoing packet passes through the router.
Using a Pool of IP Addresses
• The use of only one global address by the NAT router allows only one private-network host
to access a given external host.
• To remove this restriction, the NAT router can use a pool of global addresses.
• For example, instead of using only one global address (200.24.5.8), the NAT router can use
four addresses (200.24.5.8, 200.24.5.9, 200.24.5.10, and 200.24.5.11).
• In this case, four private-network hosts can communicate with the same external host at
the same time because each pair of addresses defines a separate connection.
• However, there are still some drawbacks. No more than four connections can be made to
the same destination.
• No private-network host can access two external server programs (e.g., HTTP and
TELNET) at the same time.
• And, likewise, two private-network hosts cannot access the same external server program
(e.g., HTTP or TELNET) at the same time.
Using Both IP Addresses and Port Addresses
• To allow a many-to-many relationship between private-network hosts and external server
programs, we need more information in the translation table.
• For example, suppose two hosts inside a private network with addresses 172.18.3.1 and
172.18.3.2 need to access the HTTP server on external host 25.8.3.2.
• If the translation table has five columns, instead of two, that include the source and
destination port addresses and the transport-layer protocol, the ambiguity is eliminated.
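
A five-column table of this kind can be sketched as follows. The private port numbers (1400 and 1401) and the class and method names are assumptions for illustration, and the global address 200.24.5.8 is reused from the earlier example. The sketch follows the simplified model of these notes, in which the router leaves port numbers unchanged and therefore needs the private ports to be unique.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    private_addr: str
    private_port: int
    external_addr: str
    external_port: int
    protocol: str


class NatRouter:
    def __init__(self, global_addr: str):
        self.global_addr = global_addr
        self.table: list[Entry] = []

    def outgoing(self, src_addr, src_port, dst_addr, dst_port, protocol):
        """Record the flow and replace the private source address."""
        entry = Entry(src_addr, src_port, dst_addr, dst_port, protocol)
        if entry not in self.table:
            self.table.append(entry)
        return (self.global_addr, src_port, dst_addr, dst_port, protocol)

    def incoming(self, src_addr, src_port, dst_port, protocol) -> Optional[str]:
        """Return the private address the response should be sent to."""
        for e in self.table:
            if (e.external_addr, e.external_port, e.private_port, e.protocol) == (
                src_addr, src_port, dst_port, protocol
            ):
                return e.private_addr
        return None  # no entry: communication was not started from inside


nat = NatRouter("200.24.5.8")
nat.outgoing("172.18.3.1", 1400, "25.8.3.2", 80, "TCP")
nat.outgoing("172.18.3.2", 1401, "25.8.3.2", 80, "TCP")

# Both responses come from 25.8.3.2 port 80; the destination port
# tells the router which private host each one belongs to.
print(nat.incoming("25.8.3.2", 80, 1400, "TCP"))  # 172.18.3.1
print(nat.incoming("25.8.3.2", 80, 1401, "TCP"))  # 172.18.3.2
```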
Internet Protocol (IP)

• The network layer in version 4 can be thought of as one main protocol and three auxiliary
ones.
• The Internet Protocol version 4 (IPv4) is responsible for packetizing, forwarding, and
delivery of a packet at the network layer.
• The Internet Control Message Protocol version 4 (ICMPv4) helps IPv4 to handle some
errors that may occur in the network-layer delivery.
• The Internet Group Management Protocol (IGMP) is used to help IPv4 in multicasting.
• The Address Resolution Protocol (ARP) is used to glue the network and data-link layers in
mapping network-layer addresses to link-layer addresses.

• IPv4 is an unreliable datagram protocol—a best-effort delivery service.


• IPv4 is also a connectionless protocol that uses the datagram approach.
• This means that each datagram is handled independently, and each datagram can follow a
different route to the destination.
• This implies that datagrams sent by the same source to the same destination could arrive
out of order.

Datagram Format
Packets used by the IP are called datagrams. A datagram is a variable-length packet consisting
of two parts: header and payload (data).
The header is 20 to 60 bytes in length and contains information essential to routing and
delivery.

Version Number: The 4-bit version number (VER) field defines the version of the IPv4
protocol, which has the value of 4.
Header Length: The 4-bit header length (HLEN) field defines the total length of the datagram
header in 4-byte words. The IPv4 datagram has a variable-length header.
Service Type: In the original design of the IP header, this field was referred to as type of
service (TOS), which defined how the datagram should be handled.
Total Length: This 16-bit field defines the total length (header plus data) of the IP datagram
in bytes. On some occasions padding is added when the datagram is encapsulated in a frame.
The receiver uses this field to determine how much of what it received is the datagram and
how much is padding.
Identification, Flags, and Fragmentation Offset: These three fields are related to the
fragmentation of the IP datagram when the size of the datagram is larger than the underlying
network can carry.
Time-to-live: The time-to-live (TTL) field is used to control the maximum number of hops
(routers) visited by the datagram. Each router that processes the datagram decrements this
number by one. If this value, after being decremented, is zero, the router discards the datagram.

Protocol: A datagram can carry a packet belonging to any transport-layer protocol such as
UDP or TCP. When the payload is encapsulated in a datagram, the corresponding protocol
number is inserted in this field; when the datagram arrives at the destination, the value of this
field helps to define to which protocol the payload should be delivered.

Header checksum: Errors in the IP header can be a disaster. For example, if the destination IP
address is corrupted, the packet can be delivered to the wrong host. If the protocol field is
corrupted, the payload may be delivered to the wrong protocol. Hence, IP adds a header
checksum field to check the header, but not the payload.
Source and Destination Addresses: These 32-bit source and destination address fields define
the IP address of the source and destination respectively.
Options: A datagram header can have up to 40 bytes of options. Options can be used for
network testing and debugging.
Payload: Payload, or data, is the packet coming from other protocols that use the service of IP.
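
The field descriptions above can be made concrete by decoding a header. The sketch below handles only the fixed 20-byte part of the header, and the sample bytes describe a made-up UDP datagram from 192.168.0.1 to 192.168.0.199; the function names are hypothetical.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of the 16-bit words, then complemented."""
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(packet: bytes) -> dict:
    (ver_hlen, _service, total_length, _ident, _flags_offset,
     ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    header_bytes = (ver_hlen & 0x0F) * 4     # HLEN counts 4-byte words
    return {
        "version": ver_hlen >> 4,
        "header_bytes": header_bytes,
        "total_length": total_length,
        "payload_bytes": total_length - header_bytes,
        "ttl": ttl,
        "protocol": protocol,                # 17 means UDP, 6 means TCP
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
        # A correct header sums to all 1s, so the complement is 0.
        "checksum_ok": internet_checksum(packet[:header_bytes]) == 0,
    }


sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
for name, value in parse_ipv4_header(sample).items():
    print(name, value)
```

Note that the checksum covers only the header, so the function never looks at the payload when verifying it.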

IPv6 PROTOCOL
• The change of the IPv6 address size requires the change in the IPv4 packet format.
• The following shows the changes implemented in the protocol in addition to changing
address size and format.
• Better header format. IPv6 uses a new header format in which options are separated from
the base header and inserted, when needed, between the base header and the data. This
simplifies and speeds up the routing process because most of the options do not need to be
checked by routers.
• New options. IPv6 has new options to allow for additional functionalities.
• Allowance for extension. IPv6 is designed to allow the extension of the protocol if
required by new technologies or applications.
• Support for resource allocation. In IPv6, the type-of-service field has been removed, but
two new fields, traffic class and flow label, have been added to enable the source to request
special handling of the packet. This mechanism can be used to support traffic such as real-
time audio and video.
• Support for more security. The encryption and authentication options in IPv6 provide
confidentiality and integrity of the packet.

IPv6 Packet Format


Each packet is composed of a base header followed by the payload. The base header occupies 40
bytes, whereas payload can be up to 65,535 bytes of information. The description of fields follows.

IPv6 datagram
• Version. The 4-bit version field defines the version number of the IP. For IPv6, the value
is 6.
• Traffic class. The 8-bit traffic class field is used to distinguish different payloads with
different delivery requirements. It replaces the type-of-service field in IPv4.
• Flow label. The flow label is a 20-bit field that is designed to provide special handling for
a particular flow of data. We will discuss this field later.
• Payload length. The 2-byte payload length field defines the length of the IP datagram
excluding the header. Note that IPv4 defines two fields related to the length: header length
and total length. In IPv6, the length of the base header is fixed (40 bytes); only the length
of the payload needs to be defined.
• Next header. The next header is an 8-bit field defining the type of the first extension header
(if present) or the type of the data that follows the base header in the datagram. This field
is similar to the protocol field in IPv4, but we talk more about it when we discuss the
payload.
• Hop limit. The 8-bit hop limit field serves the same purpose as the TTL field in IPv4.
• Source and destination addresses. The source address field is a 16-byte (128-bit) Internet
address that identifies the original source of the datagram. The destination address field is
a 16-byte (128-bit) Internet address that identifies the destination of the datagram.
• Payload. Compared to IPv4, the payload field in IPv6 has a different format and meaning.
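
The fixed 40-byte layout makes the base header easy to decode. The following is a minimal sketch; the sample values (flow label 0x12345, hop limit 64, and the 2001:db8:: documentation addresses) are made up for illustration, and parse_ipv6_base_header is a hypothetical name.

```python
import ipaddress
import struct

BASE_HEADER = "!IHBB16s16s"   # 4 + 2 + 1 + 1 + 16 + 16 = 40 bytes


def parse_ipv6_base_header(packet: bytes) -> dict:
    first_word, payload_length, next_header, hop_limit, src, dst = struct.unpack(
        BASE_HEADER, packet[:40]
    )
    return {
        "version": first_word >> 28,                 # top 4 bits
        "traffic_class": (first_word >> 20) & 0xFF,  # next 8 bits
        "flow_label": first_word & 0xFFFFF,          # last 20 bits
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": str(ipaddress.IPv6Address(src)),
        "destination": str(ipaddress.IPv6Address(dst)),
    }


# Build a sample header: version 6, traffic class 0, next header 17 (UDP).
sample = struct.pack(
    BASE_HEADER,
    (6 << 28) | (0 << 20) | 0x12345,
    8, 17, 64,
    ipaddress.IPv6Address("2001:db8::1").packed,
    ipaddress.IPv6Address("2001:db8::2").packed,
)
print(len(sample))                      # 40
print(parse_ipv6_base_header(sample))
```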

Payload in an IPv6 datagram


• The payload in IPv6 means a combination of zero or more extension headers (options)
followed by the data from other protocols (UDP, TCP, and so on).
• In IPv6, options, which are part of the header in IPv4, are designed as extension headers.
• The payload can have as many extension headers as required by the situation.
• Each extension header has two mandatory fields, next header and the length, followed by
information related to the particular option.
• Note that each next header field value (code) defines the type of the next header (hop-by-
hop option, source routing option, . . .); the last next header field defines the protocol (UDP,
TCP, . . .) that is carried by the datagram.
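
The chaining of next header fields can be sketched as a loop. The numeric codes (0, 43, 60, 6, 17) and the rule that the length byte counts 8-byte units beyond the first 8 bytes are taken from the IPv6 specification, not from these notes. The sketch deliberately handles only the hop-by-hop, routing, and destination headers, which share that length rule.

```python
# Next header codes for three extension headers and two transport protocols.
EXTENSION_NAMES = {0: "hop-by-hop", 43: "routing", 60: "destination"}
TRANSPORT_NAMES = {6: "TCP", 17: "UDP"}


def walk_chain(first_next_header: int, payload: bytes):
    """Follow the next header codes; return (names, offset of the data)."""
    names = []
    code, offset = first_next_header, 0
    while code in EXTENSION_NAMES:
        names.append(EXTENSION_NAMES[code])
        next_code = payload[offset]        # first mandatory field
        ext_len = payload[offset + 1]      # second mandatory field
        offset += (ext_len + 1) * 8        # length excludes the first 8 bytes
        code = next_code
    names.append(TRANSPORT_NAMES.get(code, f"protocol {code}"))
    return names, offset


pad_n = bytes([1, 4, 0, 0, 0, 0])          # a 6-byte PadN option
hop_by_hop = bytes([60, 0]) + pad_n        # next header: destination option
destination = bytes([17, 0]) + pad_n       # next header: UDP
udp_data = bytes(8)

# The base header's next header field would hold 0 (hop-by-hop) here.
print(walk_chain(0, hop_by_hop + destination + udp_data))
# (['hop-by-hop', 'destination', 'UDP'], 16)
```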
Concept of Flow and Priority in IPv6
• The IP protocol was originally designed as a connectionless protocol. However, the
tendency is to use the IP protocol as a connection-oriented protocol.
• In version 6, the flow label has been directly added to the format of the IPv6 datagram to
allow us to use IPv6 as a connection-oriented protocol.
• To a router, a flow is a sequence of packets that share the same characteristics, such as
traveling the same path, using the same resources, having the same kind of security, and so
on.
• A router that supports the handling of flow labels has a flow label table.
• The table has an entry for each active flow label; each entry defines the services required
by the corresponding flow label.
• When the router receives a packet, it consults its flow label table to find the corresponding
entry for the flow label value defined in the packet.
• It then provides the packet with the services mentioned in the entry. The information in the
entries is provided by the hop-by-hop options or other protocols.
• A flow label can be used to support the transmission of real-time audio and video.
• Real-time audio or video, particularly in digital form, requires resources such as high
bandwidth, large buffers, long processing time, and so on.
• A process can make a reservation for these resources beforehand to guarantee that real-
time data will not be delayed due to a lack of resources.
• The use of real-time data and the reservation of these resources require other protocols such
as Real-Time Transport Protocol (RTP) and Resource Reservation Protocol (RSVP) in
addition to IPv6.
Fragmentation and Reassembly
• IPv6 datagrams can be fragmented only by the source, not by the routers; the reassembly
takes place at the destination.
• Routers are not allowed to fragment packets; this speeds up the processing of packets
in the router.
• The fragmentation of a packet in a router needs a lot of processing.
• The packet needs to be fragmented, and all fields related to the fragmentation need to be
recalculated.
• In IPv6, the source can check the size of the packet and make the decision to fragment the
packet or not.
• When a router receives the packet, it can check the size of the packet and drop it if the size
is larger than allowed by the MTU of the network ahead.
• The router then sends a packet-too-big ICMPv6 error message to inform the source.
Extension Header
• An IPv6 packet is made of a base header and some extension headers.
• The length of the base header is fixed at 40 bytes. However, to give more functionality to
the IP datagram, the base header can be followed by up to six extension headers.
• Many of these headers are options in IPv4. Six types of extension headers have been
defined.
• These are hop-by-hop option, source routing, fragmentation, authentication, encrypted
security payload, and destination option.

Extension header types


Hop-by-Hop Option
• The hop-by-hop option is used when the source needs to pass information to all routers
visited by the datagram.
• For example, perhaps routers must be informed about certain management, debugging, or
control functions.
• So far, only three hop-by-hop options have been defined: Pad1, PadN, and jumbo payload.
1. Pad1. This option is 1 byte long and is designed for alignment purposes. Some options
need to start at a specific bit of the 32-bit word. If an option falls short of this
requirement by exactly one byte, Pad1 is added.
2. PadN. PadN is similar in concept to Pad1. The difference is that PadN is used when 2
or more bytes are needed for alignment.
3. Jumbo payload. Recall that the length of the payload in the IP datagram can be a
maximum of 65,535 bytes. However, if for any reason a longer payload is required, we
can use the jumbo payload option to define this longer length.
Destination Option
• The destination option is used when the source needs to pass information to the destination
only.
• Intermediate routers are not permitted access to this information. The format of the
destination option is the same as the hop-by-hop option. So far, only the Pad1 and PadN
options have been defined.
Source Routing
• The source routing extension header combines the concepts of the strict source route and
the loose source route options of IPv4.
Fragmentation
• In IPv4, the source or a router is required to fragment if the size of the datagram is larger
than the MTU of the network over which the datagram travels.
• In IPv6, only the original source can fragment. A source must use a Path MTU Discovery
technique to find the smallest MTU supported by any network on the path.
• The source then fragments using this knowledge. If the source does not use a Path MTU
Discovery technique, it fragments the datagram to a size of 1280 bytes or smaller.
• This is the minimum size of MTU required for each network connected to the Internet.
Authentication
• The authentication extension header has a dual purpose: it validates the message sender
and ensures the integrity of data.
• The former is needed so the receiver can be sure that a message is from the genuine sender
and not from an imposter.
• The latter is needed to check that the data is not altered in transit by some hacker.
Encrypted Security Payload
• The encrypted security payload (ESP) is an extension that provides confidentiality and
guards against eavesdropping.
Comparison of Options between IPv4 and IPv6
• The following shows a quick comparison between the options used in IPv4 and the options
used in IPv6 (as extension headers).
• The no-operation and end-of-option options in IPv4 are replaced by Pad1 and PadN options
in IPv6.
• The record route option is not implemented in IPv6 because it was not used.
• The timestamp option is not implemented because it was not used.
• The source route option is called the source route extension header in IPv6.
• The fragmentation fields in the base header section of IPv4 have moved to the
fragmentation extension header in IPv6.
• The authentication extension header is new in IPv6.
• The encrypted security payload extension header is new in IPv6.
Unicast routing
• In unicast routing, a packet is routed, hop by hop, from its source to its destination with the
help of forwarding tables.
• The source host needs no forwarding table because it delivers its packet to the default router
in its local network.
• The destination host needs no forwarding table either because it receives the packet from
its default router in its local network.
• This means that only the routers that glue together the networks in the internet need
forwarding tables.
• There are several routes that a packet can travel from the source to the destination; what
must be determined is which route the packet should take.
An Internet as a Graph
• To find the best route, an internet can be modeled as a graph.
• A graph is a set of nodes and edges (lines) that connect the nodes. We can think of each
router as a node and each link between a pair of routers as an edge.
• An internet is, in fact, modeled as a weighted graph, in which each edge is associated with
a cost.

Least-Cost Routing
• When an internet is modeled as a weighted graph, one of the ways to interpret the best
route from the source router to the destination router is to find the least cost between the
two.
• The best route between A and E is A-B-E, with the cost of 6.
• This means that each router needs to find the least-cost route between itself and all the
other routers to be able to route a packet using this criterion.

An internet and its graphical representation


Least-Cost Trees
• If there are N routers in an internet, there are (N − 1) least-cost paths from each router to
any other router.
• This means we need N × (N − 1) least-cost paths for the whole internet.
• A better way to see all of these paths is to combine them in a least-cost tree.
• A least-cost tree is a tree with the source router as the root in which the path between the
root and any other node is the shortest.
• In this way, we can have only one shortest-path tree for each node; we have N least-cost
trees for the whole internet.
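
One way to build a least-cost tree is sketched below using Dijkstra's algorithm, which these notes have not introduced at this point. The seven-node graph and its link costs are assumptions, chosen so that they agree with the costs quoted in the text (for example, A-B-E with a cost of 6); the figure itself is not reproduced here.

```python
import heapq

# Assumed link costs (symmetric), not taken from the figure.
GRAPH = {
    "A": {"B": 2, "D": 3},
    "B": {"A": 2, "C": 5, "E": 4},
    "C": {"B": 5, "F": 4, "G": 3},
    "D": {"A": 3, "E": 5},
    "E": {"B": 4, "D": 5, "F": 2},
    "F": {"C": 4, "E": 2, "G": 1},
    "G": {"C": 3, "F": 1},
}


def least_cost_tree(graph, root):
    """Return (cost, parent) dictionaries describing the tree rooted at root."""
    cost, parent = {root: 0}, {root: None}
    heap, done = [(0, root)], set()
    while heap:
        c, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for neighbor, weight in graph[node].items():
            candidate = c + weight
            if neighbor not in cost or candidate < cost[neighbor]:
                cost[neighbor], parent[neighbor] = candidate, node
                heapq.heappush(heap, (candidate, neighbor))
    return cost, parent


def route(parent, destination):
    """Read the path from the root to destination off the tree."""
    hops = []
    while destination is not None:
        hops.append(destination)
        destination = parent[destination]
    return "-".join(reversed(hops))


cost, parent = least_cost_tree(GRAPH, "A")
print(route(parent, "E"), cost["E"])     # A-B-E 6

# One tree per node: N trees for the whole internet.
trees = {node: least_cost_tree(GRAPH, node) for node in GRAPH}
print(len(trees))                        # 7
```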

Least-cost trees for nodes in the internet


The least-cost trees for a weighted graph can have several properties if they are created using
consistent criteria.
1. The least-cost route from X to Y in X’s tree is the inverse of the least-cost route from Y to X in
Y’s tree.
2. Instead of travelling from X to Z using X’s tree, we can travel from X to Y using X’s tree and
continue from Y to Z using Y’s tree.

ROUTING ALGORITHMS
Distance-Vector Routing
• In distance-vector routing, the first thing each node creates is its own least-cost tree with
the rudimentary information it has about its immediate neighbors.
• The incomplete trees are exchanged between immediate neighbors to make the trees more
and more complete and to represent the whole internet.
• A router continuously tells all of its neighbors what it knows about the whole internet
(although the knowledge can be incomplete).
Bellman-Ford Equation
• The heart of distance-vector routing is the famous Bellman-Ford equation.
• This equation is used to find the least cost (shortest distance) between a source node, x, and
a destination node, y, through some intermediary nodes (a, b, c, . . .).
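• The following shows the general case, in which $D_{ij}$ is the shortest distance and $c_{ij}$ is the cost
between nodes i and j.
$D_{xy} = \min\{(c_{xa} + D_{ay}), (c_{xb} + D_{by}), (c_{xc} + D_{cy}), \ldots\}$
• In distance-vector routing, we normally want to update an existing least cost with the cost
through an intermediary node z only if the latter is smaller. In this case, the equation becomes:
$D_{xy} = \min\{D_{xy}, (c_{xz} + D_{zy})\}$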

Graphical idea behind Bellman-Ford equation


Distance Vectors
• A least-cost tree is a combination of least-cost paths from the root of the tree to all
destinations.
• Distance-vector routing creates a distance vector, a one-dimensional array, to represent the
tree.
• Note that the name of the distance vector defines the root, the indexes define the
destinations, and the value of each cell defines the least cost from the root to the destination.
• Each node in an internet creates a very rudimentary distance vector with the minimum
information the node can obtain from its neighborhood.
• Each node discovers the identity of the immediate neighbors and the distance between itself
and each neighbor.
• It then makes a simple distance vector by inserting the discovered distances in the
corresponding cells and leaves the value of other cells as infinity.
The distance vector corresponding to a tree

The first distance vector for an internet


• These rudimentary vectors cannot help the internet to effectively forward a packet.
• After each node has created its vector, it sends a copy of the vector to all its immediate
neighbors.
• After a node receives a distance vector from a neighbor, it updates its distance vector using
the Bellman-Ford equation.
• However, we need to update not just one least cost but N of them, where N is the number
of nodes in the internet.
• If we are using a program, we can do this using a loop; if we are showing the concept on
paper, we can show the whole vector instead of the N separate equations.
• The figure shows two asynchronous events, happening one after another with some time
in between.
• In the first event, node A has sent its vector to node B. Node B updates its vector using the
cost $c_{BA} = 2$.
• In the second event, node E has sent its vector to node B.
• Node B updates its vector using the cost $c_{BE} = 4$. After the first event, node B has one
improvement in its vector: its least cost to node D has changed from infinity to 5 (via node
A).
• After the second event, node B has one more improvement in its vector; its least cost to
node F has changed from infinity to 6 (via node E).
• Exchanging vectors eventually stabilizes the system and allows all nodes to find the
ultimate least cost between themselves and any other node.
• After a node updates its vector, it immediately sends the updated vector to all neighbors.
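
The two events can be replayed in a few lines. The initial vectors below are assumptions chosen to be consistent with the costs quoted above (B reaches A at cost 2 and E at cost 4; A reaches D at cost 3; E reaches F at cost 2), since the figure is not reproduced here.

```python
import math

INF = math.inf

# Assumed first (rudimentary) vectors of nodes B, A, and E.
vector_b = {"A": 2, "B": 0, "C": 5, "D": INF, "E": 4, "F": INF, "G": INF}
vector_a = {"A": 0, "B": 2, "C": INF, "D": 3, "E": INF, "F": INF, "G": INF}
vector_e = {"A": INF, "B": 4, "C": INF, "D": 5, "E": 0, "F": 2, "G": INF}


def update(own, link_cost, received):
    """Apply D[y] = min(D[y], c + Dw[y]) to every cell of the vector."""
    return {y: min(own[y], link_cost + received[y]) for y in own}


after_first = update(vector_b, 2, vector_a)      # event 1: vector from A
after_second = update(after_first, 4, vector_e)  # event 2: vector from E

print(after_first["D"])    # 5 (via node A)
print(after_second["F"])   # 6 (via node E)
```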

Updating distance vectors

Distance-Vector Routing Algorithm


Distance_Vector_Routing ( )
{
    // Initialize (create initial vectors for the node)
    for (y = 1 to N)
    {
        if (y is myself)
            D[y] = 0
        else if (y is a neighbor)
            D[y] = c[myself][y]
        else
            D[y] = ∞
    }
    send vector {D[1], D[2], …, D[N]} to all neighbors
    // Update (improve the vector with the vector received from a neighbor)
    repeat (forever)
    {
        wait (for a vector Dw from a neighbor w or any change in the link)
        for (y = 1 to N)
        {
            D[y] = min [D[y], (c[myself][w] + Dw[y])]    // Bellman-Ford equation
        }
        if (any change in the vector)
            send vector {D[1], D[2], …, D[N]} to all neighbors
    }
}
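
The algorithm can be tried out with a small simulation. This sketch replaces the asynchronous wait with synchronous rounds in which every node receives all of its neighbors' vectors at once, which is a simplification. The graph is an assumption chosen to agree with the costs quoted in the text.

```python
import math

# Assumed link costs (symmetric), not taken from the figure.
GRAPH = {
    "A": {"B": 2, "D": 3},
    "B": {"A": 2, "C": 5, "E": 4},
    "C": {"B": 5, "F": 4, "G": 3},
    "D": {"A": 3, "E": 5},
    "E": {"B": 4, "D": 5, "F": 2},
    "F": {"C": 4, "E": 2, "G": 1},
    "G": {"C": 3, "F": 1},
}


def distance_vector_routing(graph):
    """Exchange vectors in rounds until no vector changes."""
    nodes = sorted(graph)
    # Initialize: 0 to myself, link cost to neighbors, infinity otherwise.
    vectors = {
        x: {y: 0 if y == x else graph[x].get(y, math.inf) for y in nodes}
        for x in nodes
    }
    rounds, changed = 0, True
    while changed:
        changed = False
        rounds += 1
        # Every node sends the vector it had at the start of the round.
        sent = {x: dict(v) for x, v in vectors.items()}
        for x in nodes:
            for w, cost_xw in graph[x].items():
                for y in nodes:
                    candidate = cost_xw + sent[w][y]
                    if candidate < vectors[x][y]:   # Bellman-Ford update
                        vectors[x][y] = candidate
                        changed = True
    return vectors, rounds


vectors, rounds = distance_vector_routing(GRAPH)
print(vectors["A"])
print(vectors["B"]["F"], rounds)
```

Once the loop stops, no vector can be improved any further, which corresponds to the stable state described above.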
Count to Infinity
• A problem with distance-vector routing is that any decrease in cost (good news) propagates
quickly, but any increase in cost (bad news) will propagate slowly.
• For a routing protocol to work properly, if a link is broken (cost becomes infinity), every
other router should be aware of it immediately, but in distance-vector routing, this takes
some time.
• The problem is referred to as count to infinity. It sometimes takes several updates before
the cost for a broken link is recorded as infinity by all routers.
Two-Node Loop
• One example of count to infinity is the two-node loop problem.
• To understand the problem, let us look at the scenario, a system with three nodes.
• At the beginning, both nodes A and B know how to reach node X.
• But suddenly, the link between A and X fails.
• Node A changes its table.
• If A can send its table to B immediately, everything is fine. However, the system becomes
unstable if B sends its forwarding table to A before receiving A’s forwarding table.
• Node A receives the update and, assuming that B has found a way to reach X, immediately
updates its forwarding table.
• Now A sends its new update to B. Now B thinks that something has been changed around
A and updates its forwarding table.
• The cost of reaching X increases gradually until it reaches infinity.
• At this moment, both A and B know that X cannot be reached. However, during this time
the system is not stable.
• Node A thinks that the route to X is via B; node B thinks that the route to X is via A.
• If A receives a packet destined for X, the packet goes to B and then comes back to A.
• Similarly, if B receives a packet destined for X, it goes to A and comes back to B. Packets
bounce between A and B, creating a two-node loop problem.
• A few solutions have been proposed for instability of this kind.
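The slow climb toward infinity can be illustrated with a small simulation. This is a sketch under stated assumptions: infinity is represented by 16 (as RIP does), the link cost and B's starting cost are hypothetical, and each node accepts a higher cost when it comes from its current next hop, a rule that the basic min-based pseudocode does not express.

```python
INFINITY = 16  # stand-in for "unreachable" (an assumption, borrowed from RIP)

def count_to_infinity(cost_ab=1, b_cost_to_x=2):
    """Return the (A, B) costs to X after each exchange once link A-X fails."""
    a_cost = INFINITY          # A has just lost its direct link to X
    b_cost = b_cost_to_x       # B still believes it reaches X via A
    history = [(a_cost, b_cost)]
    while a_cost < INFINITY or b_cost < INFINITY:
        # B's stale vector reaches A first; A now routes to X via B.
        a_cost = min(INFINITY, b_cost + cost_ab)
        # A advertises to B; B's next hop is A, so B must accept the new cost.
        b_cost = min(INFINITY, a_cost + cost_ab)
        history.append((a_cost, b_cost))
    return history

history = count_to_infinity()
for step, (a, b) in enumerate(history):
    print(step, a, b)
```

With these values the costs rise by two per exchange, and both nodes record X as unreachable only after eight exchanges.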

Split Horizon
• One solution to instability is called split horizon. In this strategy, instead of flooding the
table through each interface, each node sends only part of its table through each interface.
• If, according to its table, node B thinks that the optimum route to reach X is via A, it does
not need to advertise this piece of information to A; the information has come from A (A
already knows).
• Taking information from node A, modifying it, and sending it back to node A is what
creates the confusion.
• In our scenario, node B eliminates the last line of its forwarding table before it sends it to
A.
• In this case, node A keeps the value of infinity as the distance to X. Later, when node A
sends its forwarding table to B, node B also corrects its forwarding table.
• The system becomes stable after the first update: both node A and node B know that X is
not reachable.
Poison Reverse
• Using the split-horizon strategy has one drawback. Normally, the corresponding protocol
uses a timer, and if there is no news about a route, the node deletes the route from its table.
• When node B in the previous scenario eliminates the route to X from its advertisement to
A, node A cannot guess whether this is due to the split-horizon strategy (the source of
information was A) or because B has not received any news about X recently.
• In the poison reverse strategy B can still advertise the value for X, but if the source of
information is A, it can replace the distance with infinity as a warning: “Do not use this
value; what I know about this route comes from you.”
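The difference between the two strategies can be sketched as a function that builds the advertisement a node sends to one particular neighbor. The table layout (destination mapped to cost and next hop, with None for a directly connected destination) and the node C are assumptions made for this illustration.

```python
import math

INF = math.inf

def build_advertisement(table, neighbor, mode="split_horizon"):
    """Build the vector a node sends to one neighbor.

    table maps destination -> (cost, next_hop); next_hop is None
    for directly connected destinations.
    """
    advert = {}
    for dest, (cost, next_hop) in table.items():
        learned_from_neighbor = next_hop == neighbor
        if learned_from_neighbor and mode == "split_horizon":
            continue                      # omit the route entirely
        if learned_from_neighbor and mode == "poison_reverse":
            advert[dest] = INF            # advertise it as unreachable
        else:
            advert[dest] = cost
    return advert

# Node B reaches X through A (hypothetical costs).
b_table = {"A": (1, None), "C": (1, None), "X": (2, "A")}
to_a_split = build_advertisement(b_table, "A", "split_horizon")
to_a_poison = build_advertisement(b_table, "A", "poison_reverse")
print(to_a_split, to_a_poison)
```

With split horizon, X is simply missing from what B sends to A; with poison reverse, X is present with an infinite cost, so A can tell the omission is deliberate.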
Three-Node Instability
• The two-node instability can be avoided using split horizon combined with poison reverse.
However, if the instability is between three nodes, stability cannot be guaranteed.

Link-State Routing
• In this algorithm the cost associated with an edge defines the state of the link.
• Links with lower costs are preferred to links with higher costs; if the cost of a link is infinity,
it means that the link does not exist or has been broken.

Link-State Database (LSDB)


• To create a least-cost tree with this method, each node needs to have a complete map of the
network. The collection of states for all links is called the link-state database (LSDB).
• There is only one LSDB for the whole internet; each node needs to have a duplicate of it
to be able to create the least-cost tree.
• The LSDB can be represented as a two-dimensional array(matrix) in which the value of
each cell defines the cost of the corresponding link.

Example of a link-state database


• Each node can build its copy of the LSDB through a process called flooding. Each node
sends greeting messages to all its immediate neighbors to collect, for each neighbor, the
identity of the node and the cost of the link.
• The combination of these two pieces of information is called the LS packet (LSP).
• Each node sends its LSP out of every interface. A node that receives a new LSP keeps it and
then sends a copy out of each interface except the one from which the packet arrived.
• After receiving all new LSPs, each node creates the comprehensive LSDB.
• This LSDB is the same for each node and shows the whole map of the internet.
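As a rough illustration of how the collected LSPs turn into the two-dimensional LSDB, the sketch below assembles a cost matrix from one LSP per node. The four-node topology and the function name build_lsdb are hypothetical.

```python
import math

INF = math.inf

def build_lsdb(lsps):
    """Assemble a cost matrix (dict of dicts) from one LSP per node.

    Each LSP maps a neighbor's identity to the cost of the link to it.
    """
    nodes = sorted(lsps)
    # Start with 0 on the diagonal and infinity (no link) everywhere else.
    lsdb = {u: {v: (0 if u == v else INF) for v in nodes} for u in nodes}
    for node, neighbors in lsps.items():
        for neighbor, cost in neighbors.items():
            lsdb[node][neighbor] = cost
    return lsdb

# Hypothetical LSPs, one per node.
lsps = {
    "A": {"B": 2, "D": 3},
    "B": {"A": 2, "C": 5},
    "C": {"B": 5, "D": 4},
    "D": {"A": 3, "C": 4},
}
lsdb = build_lsdb(lsps)
for node, row in lsdb.items():
    print(node, row)
```

Every node that has received the same set of LSPs ends up with the same matrix.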

LSPs created and sent out by each node to build LSDB


• We can compare the link-state routing algorithm with the distance-vector routing
algorithm.
• In the distance-vector routing algorithm, each router tells its neighbors what it knows about
the whole internet; in the link-state routing algorithm, each router tells the whole internet
what it knows about its neighbors.

Formation of Least-Cost Trees


To create a least-cost tree for itself, each node needs to run the Dijkstra Algorithm. This iterative
algorithm uses the following steps:
1. The node chooses itself as the root of the tree, creating a tree with a single node, and sets
the total cost of each node based on the information in the LSDB.
2. The node selects one node, among all nodes not in the tree, which is closest to the root, and
adds this to the tree. After this node is added to the tree, the cost of all other nodes not in
the tree needs to be updated because the paths may have been changed.
3. The node repeats step 2 until all nodes are added to the tree.
Dijkstra’s Algorithm
Dijkstra’s Algorithm ( )
{
    // Initialization
    Tree = {root}
    for (y = 1 to N)
    {
        if (y is the root)
            D[y] = 0 // D[y] is shortest distance from root to node y
        else if (y is a neighbor)
            D[y] = c[root][y]
        else
            D[y] = ∞
    }
    Repeat
    {
        find a node w, with D[w] minimum among all nodes not in the Tree
        Tree = Tree ∪ {w}
        // Update distances for all neighbors of w
        for (every node x, which is a neighbor of w and not in the Tree)
        {
            D[x] = min{D[x], (D[w] + c[w][x])}
        }
    } until (all nodes included in the Tree)
}
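A runnable version of the same idea is sketched below. It uses a priority queue (Python's heapq) to find the closest node not yet in the tree, which is an implementation choice rather than part of the pseudocode; the graph is hypothetical.

```python
import heapq
import math

def least_cost_tree(lsdb, root):
    """Return (distance, parent) maps describing the least-cost tree."""
    dist = {node: math.inf for node in lsdb}
    parent = {root: None}
    dist[root] = 0
    in_tree = set()
    heap = [(0, root)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in in_tree:
            continue                      # stale heap entry, already in the tree
        in_tree.add(w)
        # Update distances for all neighbors of w that are not in the tree.
        for x, cost in lsdb[w].items():
            if x not in in_tree and d + cost < dist[x]:
                dist[x] = d + cost
                parent[x] = w
                heapq.heappush(heap, (dist[x], x))
    return dist, parent

# Hypothetical link costs (each node lists its neighbors).
graph = {
    "A": {"B": 2, "D": 3},
    "B": {"A": 2, "C": 5},
    "C": {"B": 5, "D": 4},
    "D": {"A": 3, "C": 4},
}
dist, parent = least_cost_tree(graph, "A")
print(dist)
```

The parent map records the tree itself: following parents from any node leads back to the root.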
The figure below shows the formation of the least-cost tree for the given graph.
Least-cost tree

Path-Vector Routing
• Path-vector routing allows a sender to apply specific policies to the route a packet takes.
• Aside from safety and security, there are occasions in which the goal of routing is merely
reachability: to allow the packet to reach its destination more efficiently without assigning
costs to the route.
• The source can control the path. Path-vector routing is mostly designed to route a packet
between ISPs.
Spanning Trees
• In path-vector routing, the path from a source to all destinations is also determined by the
best spanning tree.
• The best spanning tree is the tree determined by the source when it imposes its own policy.
• One of the common policies uses the minimum number of nodes to be visited.
• Another common policy is to avoid some nodes as the middle node in a route.
• In the example below, the policy imposed by all sources is to use the minimum number of
nodes to reach a destination.
• The spanning tree selected by A and E is such that the communication does not pass through
D as a middle node. Similarly, the spanning tree selected by B is such that the
communication does not pass through C as a middle node.

Spanning trees in path-vector routing


Creation of Spanning Trees
• Path-vector routing, like distance-vector routing, is an asynchronous and distributed
routing algorithm.
• When a node is booted, it creates a path vector based on the information it can obtain about
its immediate neighbor.
• After creating the initial path vector, the node sends it to all its immediate neighbors.
• When a node receives a path vector from a neighbor, it updates its own path vector using
an equation similar to Bellman-Ford, but applying its own policy instead of looking for the
least cost.
• We can define this equation as
Path(x, y) = best {Path(x, y), [x + Path(v, y)]} for all v’s in the internet.
Path vectors made at booting time
• The policy is defined by selecting the best of multiple paths.
• If Path (v, y) includes x, that path is discarded to avoid a loop in the path. In other words,
x does not want to visit itself when it selects a path to y.
• The figure shows the path vector of node C after two events. In the first event, node C receives a copy of
B’s vector, which improves its vector: now it knows how to reach node A.
• In the second event, node C receives a copy of D’s vector, which does not change its vector.
As a matter of fact the vector for node C after the first event is stabilized and serves as its
forwarding table.

Updating path vectors


Path-Vector Algorithm
• Based on the initialization process and the equation used in updating each forwarding table
after receiving path vectors from neighbors, a simplified version of the path-vector
algorithm can be given as:
Path_Vector_Routing ( )
{
    // Initialization
    for (y = 1 to N)
    {
        if (y is myself)
            Path[y] = myself
        else if (y is a neighbor)
            Path[y] = myself + neighbor node
        else
            Path[y] = empty
    }
    Send vector {Path[1], Path[2], …, Path[N]} to all neighbors
    // Update
    repeat (forever)
    {
        wait (for a vector Pathw from a neighbor w)
        for (y = 1 to N)
        {
            if (Pathw[y] includes myself)
                discard the path // Avoid any loop
            else
                Path[y] = best {Path[y], (myself + Pathw[y])}
        }
        If (there is a change in the vector)
            Send vector {Path[1], Path[2], …, Path[N]} to all neighbors
    }
}
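The update step can be sketched in Python as below. The policy used for best (prefer the path with fewer nodes, keep the current path on a tie) is only one possible policy, and the small topology is hypothetical; paths are tuples of node names, with None meaning no known path.

```python
def best(current, candidate):
    """Policy: prefer the path with fewer nodes; keep the current one on ties."""
    if current is None:
        return candidate
    if candidate is None:
        return current
    return candidate if len(candidate) < len(current) else current

def update_paths(myself, paths, neighbor_paths):
    """Merge a neighbor's path vector into this node's path vector."""
    updated = dict(paths)
    for dest, path in neighbor_paths.items():
        if path is None or myself in path:
            continue                      # no path, or it would loop through me
        updated[dest] = best(updated.get(dest), (myself,) + path)
    return updated

# Node C initially knows only its neighbors B and D.
c_paths = {"A": None, "B": ("C", "B"), "C": ("C",), "D": ("C", "D")}
# Vector received from B.
b_paths = {"A": ("B", "A"), "B": ("B",), "C": ("B", "C"), "D": None}
c_paths = update_paths("C", c_paths, b_paths)
print(c_paths)
```

After the update, node C has learned a path to A through B, while B's path to C is ignored because it already contains C.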

UNICAST ROUTING PROTOCOLS


• Routing Information Protocol (RIP), based on the distance-vector algorithm
• Open Shortest Path First (OSPF), based on the link-state algorithm
• Border Gateway Protocol (BGP), based on the path-vector algorithm

Internet Structure
• The Internet has changed from a tree-like structure, with a single backbone, to a multi-
backbone structure run by different private corporations today.

Internet structure
• There are several backbones run by private communication companies that provide global
connectivity.
• These backbones are connected by some peering points that allow connectivity between
backbones.
• At a lower level, there are some provider networks that use the backbones for global
connectivity but provide services to Internet customers.
• Finally, there are some customer networks that use the services provided by the provider
networks.
• Any of these three entities (backbone, provider network, or customer network) can be
called an Internet Service Provider or ISP.

Hierarchical Routing
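• The Internet today is made of a huge number of networks and routers that connect them.
Routing in the Internet cannot be done using a single protocol for two reasons: a scalability
problem and an administrative issue.
• The scalability problem means that the size of the forwarding tables becomes huge, searching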
for a destination in a forwarding table becomes time-consuming, and updating creates a
huge amount of traffic.
• The administrative issue is related to the Internet structure, as the administrator needs to
have control in its system.
• The organization must be able to use as many subnets and routers as it needs, may desire
that the routers be from a particular manufacturer, may wish to run a specific routing
algorithm to meet the needs of the organization, and may want to impose some policy on
the traffic passing through its ISP.
• Hierarchical routing means considering each ISP as an autonomous system (AS).
• Each AS can run a routing protocol that meets its needs, but the global Internet runs a global
protocol to glue all ASs together.
• The routing protocol run in each AS is referred to as intra-AS routing protocol, intradomain
routing protocol, or interior gateway protocol (IGP); the global routing protocol is referred
to as inter-AS routing protocol, interdomain routing protocol, or exterior gateway protocol
(EGP).
• The two common intradomain routing protocols are RIP and OSPF; the only interdomain
routing protocol is BGP.

Autonomous Systems
• We may have small, medium-size, and large ASs. Each AS is given an autonomous system
number (ASN) by the ICANN.
• Each ASN was originally a 16-bit unsigned integer that uniquely defines an AS; 32-bit ASNs
are now also assigned.
• Autonomous systems are not categorized according to their size; they are categorized
according to the way they are connected to other ASs.
• We have stub ASs, multihomed ASs, and transient ASs (more commonly called transit ASs).
The type affects the operation of the interdomain routing protocol in relation to that AS.
• Stub AS. A stub AS has only one connection to another AS. The data traffic can be either
initiated or terminated in a stub AS; the data cannot pass through it. A good example of a
stub AS is the customer network, which is either the source or the sink of data.
• Multihomed AS. A multihomed AS can have more than one connection to other ASs, but
it does not allow data traffic to pass through it. A good example of such an AS is some of
the customer ASs that may use the services of more than one provider network, but their
policy does not allow data to be passed through them.
• Transient AS. A transient AS is connected to more than one other AS and also allows the
traffic to pass through. The provider networks and the backbone are good examples of
transient ASs.

Routing Information Protocol (RIP)


• The Routing Information Protocol (RIP) is one of the most widely used intradomain routing
protocols based on the distance-vector routing algorithm.
• RIP was started as part of the Xerox Network System (XNS), but it was the Berkeley
Software Distribution (BSD) version of UNIX that helped make the use of RIP widespread.

Hop Count
• A router in this protocol basically implements the distance-vector routing algorithm.
• However, the algorithm has been modified as described below.
• First, since a router in an AS needs to know how to forward a packet to different
networks(subnets) in an AS, RIP routers advertise the cost of reaching different networks
instead of reaching other nodes in a theoretical graph.
• Second, to make the implementation of the cost simpler, the cost is defined as the number
of hops, which means the number of networks (subnets) a packet needs to travel through
from the source router to the final destination host.
• In RIP, the maximum cost of a path can be 15, which means that 16 is treated as infinity
(no connection). For this reason, RIP can be used only in autonomous systems in which the
diameter of the AS is not more than 15 hops.

Hop counts in RIP


Forwarding Tables
• A forwarding table in RIP is a three-column table in which the first column is the address
of the destination network, the second column is the address of the next router to which the
packet should be forwarded, and the third column is the cost (the number of hops) to reach
the destination network.

Forwarding tables
• The forwarding table gives the information about the whole least-cost tree based on the
second property of these trees.
• For example, R1 defines that the next router for the path to N4 is R2; R2 defines that the
next router to N4 is R3; R3 defines that there is no next router for this path.
• The tree is then R1 → R2 → R3 → N4.

RIP Implementation
• RIP is implemented as a process that uses the service of UDP on the well-known port
number 520.
• In BSD, RIP is a daemon process (a process running in the background), named routed
(abbreviation for route daemon and pronounced route-dee). This means that, although RIP
is a routing protocol to help IP route its datagrams through the AS, the RIP messages are
encapsulated inside UDP user datagrams, which in turn are encapsulated inside IP
datagrams. In other words, RIP runs at the application layer, but creates forwarding tables
for IP at the network layer.
• RIP has gone through two versions: RIP-1 and RIP-2. The second version is backward
compatible with the first version; it allows the use of more information in the fields of RIP
messages that were set to 0 in the first version.

RIP Messages
• Two RIP processes, a client and a server, need to exchange messages. RIP-2 defines the
format of the message.
• Part of the message, entry, can be repeated as needed in a message.
RIP message format
• RIP has two types of messages: request and response.
• A request message is sent by a router that has just come up or by a router that has some
time-out entries. A request message can ask about specific entries or all entries.
• A response (or update) message can be either solicited or unsolicited.
• A solicited response message is sent only in answer to a request message.
• An unsolicited response message, on the other hand, is sent periodically, every 30 seconds
or when there is a change in the forwarding table.

RIP Algorithm
RIP implements the same algorithm as the distance-vector routing algorithm, with some
changes:
• Instead of sending only distance vectors, a router sends the whole contents of its forwarding
table in a response message.
• The receiver adds one hop to each cost and changes the next router field to the address of the
sending router. The receiving router keeps the old routes except in the following
three cases:
1. If the received route does not exist in the old forwarding table, it should be added to the
table.
2. If the cost of the received route is lower than the cost of the old one, the received route
should be selected as the new one.
3. If the cost of the received route is higher than the cost of the old one, but the value of the
next router is the same in both routes, the received route should be selected as the new one.
This is the case where the route was actually advertised by the same router in the past, but
now the situation has changed.
• The new forwarding table needs to be sorted according
to the destination route (mostly using the longest prefix first).
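The three cases can be sketched as a single merge function. This is an illustration only: the table layout (destination mapped to cost and next router, with None for a directly connected network), the router and network names, and the final sort by destination name (a stand-in for longest-prefix ordering) are all assumptions.

```python
def rip_update(table, sender, received, infinity=16):
    """Merge a RIP response from `sender` into a forwarding table.

    table and the return value map destination -> (cost, next_router);
    received maps destination -> cost as advertised by the sender.
    """
    new_table = dict(table)
    for dest, advertised in received.items():
        cost = min(advertised + 1, infinity)      # add one hop, cap at infinity
        old = table.get(dest)
        if old is None:                           # case 1: new destination
            new_table[dest] = (cost, sender)
        elif cost < old[0]:                       # case 2: cheaper route
            new_table[dest] = (cost, sender)
        elif old[1] == sender:                    # case 3: same next router
            new_table[dest] = (cost, sender)
    # Sorting by name here; a real table would be ordered by prefix length.
    return dict(sorted(new_table.items()))

old_table = {"N1": (1, None), "N4": (3, "R2"), "N5": (2, "R3")}
response_from_r2 = {"N4": 4, "N5": 3, "N6": 1}
new_table = rip_update(old_table, "R2", response_from_r2)
print(new_table)
```

Here N6 is added (case 1), N4 becomes more expensive because its next router R2 now reports a higher cost (case 3), and N5 keeps its cheaper route through R3.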
Example of an autonomous system using RIP
Timers in RIP
• RIP uses three timers to support its operation. The periodic timer controls the advertising
of regular update messages.
• Each router has one periodic timer that is randomly set to a number between 25 and 35
seconds.
• The timer counts down; when zero is reached, the update message is sent, and the timer is
randomly set once again.
• The expiration timer governs the validity of a route. When a router receives update
information for a route, the expiration timer is set to 180 seconds for that particular route.
• The garbage collection timer is used to purge a route from the forwarding table.
• When the information about a route becomes invalid, the router does not immediately purge
that route from its table. Instead, it continues to advertise the route with a metric value of
16. At the same time, a garbage collection timer is set to 120 seconds for that route.
• When the count reaches zero, the route is purged from the table. This timer allows
neighbors to become aware of the invalidity of a route prior to purging.
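The interplay of the three timers can be sketched as follows. The function names and state labels are hypothetical, and the sketch assumes that a route becomes invalid exactly when its 180-second expiration timer runs out.

```python
import random

EXPIRATION = 180          # seconds a route stays valid without an update
GARBAGE_COLLECTION = 120  # further seconds it is advertised with metric 16

def next_periodic_delay():
    """Pick the next periodic-timer value, between 25 and 35 seconds."""
    return random.uniform(25, 35)

def route_state(seconds_since_update):
    """Classify a route by the time since its last update was received."""
    if seconds_since_update < EXPIRATION:
        return "valid"
    if seconds_since_update < EXPIRATION + GARBAGE_COLLECTION:
        return "invalid"      # still advertised, with metric 16
    return "purged"

for age in (30, 200, 300):
    print(age, route_state(age))
```

Randomizing the periodic timer keeps routers from sending their updates at the same moment.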

Performance
• Update Messages. The update messages in RIP have a very simple format and are sent only
to neighbors. They do not normally create heavy traffic because the routers try to avoid sending
them at the same time.
• Convergence of Forwarding Tables. RIP uses the distance-vector algorithm, which can
converge slowly if the domain is large, but, since RIP allows only 15 hops in a domain,
there is normally no problem in convergence. The only problems that may slow down
convergence are count-to-infinity and loops created in the domain; use of poison-reverse
and split-horizon strategies added to the RIP extension may alleviate the situation.
• Robustness. The calculation of the forwarding table depends on information received
from immediate neighbors, which in turn receive their information from their own
neighbors. If there is a failure or corruption in one router, the problem will be propagated
to all routers and the forwarding in each router will be affected.

Open Shortest Path First (OSPF)


Open Shortest Path First (OSPF) is also an intradomain routing protocol like RIP, but it is based
on the link-state routing protocol. OSPF is an open protocol, which means that the specification is
a public document.
Metric
• Each link can be assigned a weight based on the throughput, round-trip time, reliability,
and so on.
• An administration can also decide to use the hop count as the cost.
• Different service types (TOSs) can have different weights as the cost.

Metric in OSPF
Forwarding Tables
Each OSPF router can create a forwarding table after finding the shortest-path tree between itself
and the destination using Dijkstra’s algorithm. Comparing the forwarding tables for OSPF and
RIP in the same AS, the only difference is in the cost values: OSPF uses the assigned link
metrics, whereas RIP uses hop counts.

Forwarding tables in OSPF


Areas
• OSPF was designed to be able to handle routing in a small or large autonomous system.
However, the formation of shortest-path trees in OSPF requires that all routers flood the
whole AS with their LSPs to create the global LSDB.
• This creates a huge volume of traffic in a large AS. To prevent this, the AS needs to be
divided into small sections called areas.
• Each area acts as a small independent domain for flooding LSPs. In other words, OSPF
uses another level of hierarchy in routing: the first level is the autonomous system, the
second is the area.
• One of the areas in the AS is designated as the backbone area, responsible for gluing the
areas together.
• The routers in the backbone area are responsible for passing the information collected by
each area to all other areas.
• In this way, a router in an area can receive all LSPs generated in other areas. For the purpose
of communication, each area has an area identification. The area identification of the
backbone is zero.

Areas in an autonomous system

Link-State Advertisement
We can have five types of link-state advertisements: router link, network link, summary link to
network, summary link to AS border router, and external link. The figure below shows these five
advertisements and their uses.
Five different LSPs
Router link
• A router link advertises the existence of a router as a node.
• This type of advertisement can define one or more types of links that connect the
advertising router to other entities.
• A transient link announces a link to a transient network, a network that is connected to the
rest of the networks by one or more routers.
• This type of advertisement should define the address of the transient network and the cost
of the link.
• A stub link advertises a link to a stub network, a network that is not a through network.
• A point-to-point link should define the address of the router at the end of the point-to-point
line and the cost to get there.

Network link
• A network link advertises the network as a node.
• Since a network cannot do announcements itself (it is a passive entity), one of the routers
is assigned as the designated router and does the advertising.
• In addition to the address of the designated router, this type of LSP announces the IP
address of all routers, but no cost is advertised because each router announces the cost to
the network when it sends a router link advertisement.

Summary link to network


• This is done by an area border router; it advertises the summary of links collected by the
backbone to an area or the summary of links collected by the area to the backbone.
• This type of information exchange is needed to glue the areas together.

Summary link to AS
• This is done by an AS router that advertises the summary links from other ASs to the
backbone area of the current AS, information which later can be disseminated to the areas
so that they will know about the networks in other ASs.
• This type of information exchange is needed for inter-AS routing.

External link
• This is also done by an AS router to announce the existence of a single network outside the
AS to the backbone area to be disseminated into the areas.

OSPF Implementation
• OSPF is implemented as a program in the network layer, using the service of the IP for
propagation.
• An IP datagram that carries a message from OSPF sets the value of the protocol field to 89.
• That is, OSPF messages are encapsulated directly inside IP datagrams.
• OSPF has gone through two versions: version 1 and version 2. Most implementations use
version 2.

OSPF Messages
• OSPF is a very complex protocol; it uses five different types of messages.
• The figure below shows the format of the OSPF common header (which is used in all
messages) and the link-state general header (which is used in some messages).
• It also outlines the five message types used in OSPF.
• The hello message (type 1) is used by a router to introduce itself to the neighbors and
announce all neighbors that it already knows.
• The database description message (type 2) is normally sent in response to the hello message
to allow a newly joined router to acquire the full LSDB.
• The link state request message (type 3) is sent by a router that needs information about a
specific LS.
• The link-state update message (type 4) is the main OSPF message used for building the
LSDB.
• This message has five different versions (router link, network link, summary link to
network, summary link to AS border router, and external link).
• The link-state acknowledgment message (type 5) is used to create reliability in OSPF; each
router that receives a link-state update message needs to acknowledge it.

OSPF message formats


Authentication
• As the figure above shows, the OSPF common header has the provision for authentication of the
message sender.
• This prevents a malicious entity from sending OSPF messages to a router and causing the
router to become part of the routing system to which it actually does not belong.

OSPF Algorithm
• OSPF implements the link-state routing algorithm.
• However, some changes and augmentations need to be added to the algorithm: after each
router has created the shortest-path tree, the algorithm needs to use it to create the
corresponding forwarding table.

Performance

1. Update Messages. The link-state messages in OSPF have a somewhat complex format.
They also are flooded to the whole area. If the area is large, these messages may create
heavy traffic and use a lot of bandwidth.
2. Convergence of Forwarding Tables. When the flooding of LSPs is completed, each router
can create its own shortest-path tree and forwarding table; convergence is fairly quick.
However, each router needs to run Dijkstra’s algorithm, which may take some time.
3. Robustness. The OSPF protocol is more robust than RIP because, after receiving the
completed LSDB, each router is independent and does not depend on other routers in the
area. Corruption or failure in one router does not affect other routers as seriously as in RIP.

Border Gateway Protocol Version 4 (BGP4)


• The Border Gateway Protocol version 4 (BGP4) is the only interdomain routing protocol
used in the Internet today. BGP4 is based on the path-vector algorithm.
• The figure below shows an example of an internet with four autonomous systems. AS2, AS3,
and AS4 are stub autonomous systems; AS1 is a transient one. Data exchange between
AS2, AS3, and AS4 should pass through AS1.
• Each autonomous system uses one of the two common intradomain protocols, RIP or
OSPF. Each router in each AS knows how to reach a network that is in its own AS, but it
does not know how to reach a network in another AS.
• To enable each router to route a packet to any network in the internet, install a variation of
BGP4, called external BGP (eBGP), on each border router. Then install the second
variation of BGP, called internal BGP (iBGP), on all routers.
• This means that the border routers will be running three routing protocols (intradomain,
eBGP, and iBGP), but other routers are running two protocols (intradomain and iBGP).
A sample internet with four ASs

Operation of External BGP (eBGP)


• BGP is a kind of point-to-point protocol. When the software is installed on two routers,
they try to create a TCP connection using the well-known port 179.
• In other words, a pair of client and server processes continuously communicate with each
other to exchange messages. The two routers that run the BGP processes are called BGP
peers or BGP speakers.
• The eBGP variation of BGP allows two physically connected border routers in two
different ASs to form pairs of eBGP speakers and exchange messages.
• The routers in the figure above form three pairs: R1-R5, R2-R6, and R4-R9. Each logical
connection in BGP parlance is referred to as a session.
• Message number 1 is sent by router R1 and tells router R5 that N1, N2, N3, and N4 can be
reached through router R1.
• Router R5 can now add these pieces of information at the end of its forwarding table.
• When R5 receives any packet destined for these four networks, it can use its forwarding
table and find that the next router is R1.
eBGP operation
There are two problems that need to be addressed:
1. Some border routers do not know how to route a packet destined for nonneighbor ASs.
2. None of the nonborder routers know how to route a packet for any networks in other ASs.
To address the above two problems, we need to allow all pairs of routers (border or nonborder) to
run the second variation of the BGP protocol, iBGP.
Operation of Internal BGP (iBGP)
• The iBGP protocol is similar to the eBGP protocol in that it uses the service of TCP on the
well-known port 179, but it creates a session between any possible pair of routers inside an
autonomous system.
• If an AS has only one router, there cannot be an iBGP session. If there are n routers in an
autonomous system, there should be [n × (n − 1) / 2] iBGP sessions in that autonomous
system to prevent loops in the system.
• In other words, each router needs to advertise its own reachability to the peer in the session
instead of flooding what it receives from another peer in another session. Below figure
shows the combination of eBGP and iBGP sessions in our internet.
Combination of eBGP and iBGP sessions in the internet
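The full-mesh requirement can be checked with a short sketch that enumerates one session per unordered pair of routers. The function name is hypothetical; the router names follow AS1 in the example.

```python
from itertools import combinations

def ibgp_sessions(routers):
    """List every unordered pair of routers, one iBGP session per pair."""
    return list(combinations(routers, 2))

sessions = ibgp_sessions(["R1", "R2", "R3", "R4"])
print(len(sessions), sessions)
```

For the four routers of AS1 this gives 4 × 3 / 2 = 6 sessions, and a single-router AS yields none.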
• Initially four messages are exchanged. The first message (numbered 1) is sent by R1
announcing that networks N8 and N9 are reachable through the path AS1-AS2, but the next
router is R1.
• This message is sent, through separate sessions, to R2, R3, and R4. Routers R2, R4, and
R6 do the same thing but send different messages to different destinations.
• R3, R7, and R8 create sessions with their peers, but they actually have no message to send.
• After R1 receives the update message from R2, it combines the reachability information
about AS3 with the reachability information it already knows about AS1 and sends a new
update message to R5.
• Now R5 knows how to reach networks in AS1 and AS3. The process continues when R1
receives the update message from R4.
• Each router combines the information received from eBGP and iBGP and creates a path
table after applying the criteria for finding the best path.
• Below figure shows the path tables for the routers.
Finalized BGP path tables

Injection of Information into Intradomain Routing


• The path tables collected and organized by BGP are injected into intradomain forwarding
tables (RIP or OSPF) for routing packets.
• In the case of a stub AS, the only area border router adds a default entry at the end of its
forwarding table and defines the next router to be the speaker router at the end of the eBGP
connection. For example, R5 in AS2 defines R1 as the default router for all networks other
than N8 and N9.
• In the case of a transient AS, the situation is more complicated. For example, R1 in AS1
needs to inject the whole contents of the path table for R1 into its intradomain forwarding
table.
• One issue to be resolved is the cost value. We know that RIP and OSPF use different
metrics. One solution is to set the cost to the foreign networks at the same cost value as to
reach the first AS in the path. For example, the cost for R5 to reach all networks in other
ASs is the cost to reach N5.
Forwarding tables after injection from BGP

Address Aggregation
• The intradomain forwarding tables obtained with the help of the BGP4 protocol may
become huge in the case of the global Internet because many destination networks may be
included in a forwarding table.
• Fortunately, BGP4 uses the prefixes as destination identifiers and allows the aggregation
of these prefixes. For example, prefixes 14.18.20.0/26, 14.18.20.64/26, 14.18.20.128/26,
and 14.18.20.192/26, can be combined into 14.18.20.0/24 if all four subnets can be reached
through one path.
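
The aggregation in this example can be checked with Python's standard ipaddress module. This is a minimal sketch, not how a BGP router implements aggregation; the aggregate helper is a name chosen for the illustration, and it assumes all four subnets are reachable through one path.

```python
import ipaddress


def aggregate(prefixes):
    """Collapse adjacent prefixes into the smallest equivalent list."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


subnets = [
    "14.18.20.0/26",
    "14.18.20.64/26",
    "14.18.20.128/26",
    "14.18.20.192/26",
]
print(aggregate(subnets))
```

The four /26 blocks are contiguous and together cover exactly the /24, so they collapse into the single prefix 14.18.20.0/24.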

Path Attributes
• In both intradomain routing protocols (RIP and OSPF), a destination is normally associated
with two pieces of information: next hop and cost.
• In BGP these pieces are called path attributes. BGP allows a destination to be associated
with up to seven path attributes.
• Path attributes are divided into two broad categories: well-known and optional.
• A well-known attribute must be recognized by all routers; an optional attribute need not
be.
• A well-known attribute can be mandatory, which means that it must be present in any BGP
update message. All attributes are inserted after the corresponding destination prefix in an
update message. The format for an attribute is shown below.

Format of path attribute


The first byte in each attribute defines the four attribute flags. The next byte defines the
attribute type, as assigned by ICANN. The attribute value length defines the length of the
attribute value field.
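
The layout described above can be sketched as a small parser. The flag names and bit positions (optional, transitive, partial, extended length) are taken from the BGP-4 specification (RFC 4271) rather than from these notes, and parse_path_attribute is a hypothetical helper written for illustration. When the extended-length flag is set, the length field occupies two bytes instead of one.

```python
import struct


def parse_path_attribute(data):
    """Split one encoded path attribute into flags, type, length, and value."""
    flags, type_code = struct.unpack_from("!BB", data, 0)
    extended = bool(flags & 0x10)
    if extended:
        # Extended length: the length field is two bytes.
        (length,) = struct.unpack_from("!H", data, 2)
        value_start = 4
    else:
        length = data[2]
        value_start = 3
    return {
        "optional": bool(flags & 0x80),
        "transitive": bool(flags & 0x40),
        "partial": bool(flags & 0x20),
        "extended_length": extended,
        "type": type_code,
        "length": length,
        "value": data[value_start:value_start + length],
    }


# Example: LOCAL-PREF (type 5), well-known, 4-byte value of 100.
encoded = bytes([0x40, 0x05, 0x04, 0x00, 0x00, 0x00, 0x64])
attribute = parse_path_attribute(encoded)
print(attribute["type"], int.from_bytes(attribute["value"], "big"))
```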
A brief description of the seven attributes:
ORIGIN (type 1):
This is a well-known mandatory attribute, which defines the source of the routing information.
This attribute takes one of three values: 0, 1, or 2. Value 0 (IGP) means that the
information about the path has been taken from an intradomain protocol (RIP or OSPF). Value 1
(EGP) means that the information was learned through the exterior gateway protocol EGP.
Value 2 (INCOMPLETE) means that it comes from some other or unknown source.
AS-PATH (type 2):
This is a well-known mandatory attribute, which defines the list of autonomous systems through
which the destination can be reached. The AS-PATH attribute helps to prevent loops. Whenever
an update message arrives at a router that lists the current AS in the path, the router drops that path.
The AS-PATH can also be used in route selection.
NEXT-HOP (type 3):
This is a well-known mandatory attribute, which defines the next router to which the data packet
should be forwarded. This attribute helps to inject path information collected through the
operations of eBGP and iBGP into the intradomain routing protocols such as RIP or OSPF.
MULT-EXIT-DISC (type 4):
The multiple-exit discriminator is an optional nontransitive attribute. If a router has multiple
paths to the destination with different values of this attribute, the one with the lowest value is
selected.
LOCAL-PREF (type 5):
The local preference attribute is a well-known discretionary attribute. It is normally set by the
administrator, based on the organization policy. The routes the administrator prefers are given a
higher local preference value.
ATOMIC-AGGREGATE (type 6):
This is a well-known discretionary attribute, which defines the destination prefix as not aggregate;
it only defines a single destination network. This attribute has no value field, which means the
value of the length field is zero.
AGGREGATOR (type 7):
This is an optional transitive attribute, which emphasizes that the destination prefix is an aggregate.
The attribute value gives the number of the last AS that did the aggregation followed by the IP
address of the router that did so.
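
The loop-prevention role of AS-PATH described above can be sketched as follows. This is a simplified illustration, not a router implementation: AS numbers are plain integers chosen for the example, and accept_route and advertise are hypothetical helper names.

```python
def accept_route(local_as, as_path):
    """Reject a route whose AS-PATH already contains the local AS (a loop)."""
    return local_as not in as_path


def advertise(local_as, as_path):
    """Prepend the local AS before passing the route to a peer in another AS."""
    return [local_as] + list(as_path)


# A router in AS 1 learns a path through AS 3 and advertises it onward.
print(advertise(1, [3]))
# If the route later comes back listing AS 1, the router drops it.
print(accept_route(1, [2, 1, 3]))
```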

Route Selection

Flow diagram for route selection


• In the case where multiple routes are received to a destination, BGP needs to select one
among them. A route in BGP has some attributes attached to it and it may come from an
eBGP session or an iBGP session.
• The figure above shows the flow diagram as used by common implementations. The router
extracts the routes which meet the criteria in each step.
• If only one route is extracted, it is selected and the process stops; otherwise, the process
continues with the next step.
• The first choice is related to the LOCAL-PREF attribute, which reflects the policy imposed
by the administration on the route.
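
The step-by-step filtering described above can be sketched in Python. Only the first step (highest LOCAL-PREF) comes from these notes; the later steps (shortest AS-PATH, then lowest MULT-EXIT-DISC) are assumptions added to make the example complete, and real implementations apply more tie-breakers. The Route class and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    next_hop: str
    local_pref: int
    as_path: tuple
    med: int = 0


def highest_local_pref(routes):
    best = max(r.local_pref for r in routes)
    return [r for r in routes if r.local_pref == best]


def shortest_as_path(routes):  # assumed second step
    best = min(len(r.as_path) for r in routes)
    return [r for r in routes if len(r.as_path) == best]


def lowest_med(routes):  # assumed third step, simplified
    best = min(r.med for r in routes)
    return [r for r in routes if r.med == best]


STEPS = (highest_local_pref, shortest_as_path, lowest_med)


def select_route(routes):
    """Apply each step in turn and stop as soon as one route remains."""
    candidates = list(routes)
    if not candidates:
        raise ValueError("no routes to select from")
    for step in STEPS:
        candidates = step(candidates)
        if len(candidates) == 1:
            break
    # If a tie survives every step, this sketch just takes the first route.
    return candidates[0]


a = Route("R2", local_pref=100, as_path=(2, 3))
b = Route("R4", local_pref=200, as_path=(4, 5, 6))
c = Route("R3", local_pref=100, as_path=(2,))
print(select_route([a, b, c]).next_hop)  # b wins on LOCAL-PREF alone
print(select_route([a, c]).next_hop)     # tie on LOCAL-PREF, c has the shorter path
```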

Messages
BGP uses four types of messages for communication between the BGP speakers across the ASs
and inside an AS: open, update, keepalive, and notification. All BGP packets share the same
common header.

BGP messages
1. Open Message: To create a neighborhood relationship, a router running BGP opens a TCP
connection with a neighbor and sends an open message.
2. Update Message: The update message is used by a router to withdraw destinations that
have been advertised previously, to announce a route to a new destination, or both. Note
that BGP can withdraw several destinations that were advertised before, but it can only
advertise one new destination (or multiple destinations with the same path attributes) in a
single update message.
3. Keepalive Message: BGP peers exchange keepalive messages regularly (before their hold
time expires) to tell each other that they are alive.
4. Notification: A notification message is sent by a router whenever an error condition is
detected or a router wants to close the session.
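
The common header shared by the four message types can be sketched with Python's struct module. The field sizes (16-byte marker, 2-byte length, 1-byte type) and the type codes are taken from the BGP-4 specification (RFC 4271), not from these notes; build_keepalive and parse_header are hypothetical helpers. A keepalive message consists of the header alone.

```python
import struct

# Type codes as defined in RFC 4271.
MESSAGE_TYPES = {1: "open", 2: "update", 3: "notification", 4: "keepalive"}

# Marker (16 bytes), total message length (2 bytes), type (1 byte).
HEADER = struct.Struct("!16sHB")


def build_keepalive():
    """A keepalive carries no body, so its length is the header size."""
    return HEADER.pack(b"\xff" * 16, HEADER.size, 4)


def parse_header(data):
    """Return (length, message type name) from the common header."""
    _marker, length, type_code = HEADER.unpack_from(data)
    return length, MESSAGE_TYPES.get(type_code, "unknown")


message = build_keepalive()
print(len(message), parse_header(message))
```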

Performance
BGP performance can be compared with RIP. BGP speakers exchange a lot of messages to create
forwarding tables, but BGP is free from loops and count-to-infinity. However, RIP's weakness
regarding the propagation of failure and corruption also exists in BGP.

Multicast Link State (MOSPF)


• Multicast Open Shortest Path First (MOSPF) is the extension of the Open Shortest Path
First (OSPF) protocol, which is used in unicast routing.
• It also uses the source-based tree approach to multicasting. In unicast link-state routing,
each router in the internet has a link-state database (LSDB) that can be used to create a
shortest-path tree.
• To extend unicasting to multicasting, each router needs to have another database, to show
which interface has an active member in a particular group.
Now a router goes through the following steps to forward a multicast packet received from source
S and to be sent to destination G (a group of recipients):
1. The router uses the Dijkstra algorithm to create a shortest-path tree with S as the root and
all destinations in the internet as the leaves. In this case, the root of the tree is the source of
the packet defined in the source address of the packet. The shortest-path tree created this
way depends on the specific source. For each source we need to create a different tree.
2. The router creates a shortest-path subtree with itself as the root of the subtree.
3. The shortest-path subtree is actually a broadcast subtree with the router as the root and all
networks as the leaves. The router now uses a strategy to prune the broadcast tree and to
change it to a multicast tree. The IGMP protocol is used to find the information at the leaf
level. MOSPF has added a new type of link state update packet that floods the membership
to all routers.
4. The router can now forward the received packet out of only those interfaces that correspond
to the branches of the multicast tree. A copy of the multicast packet thus reaches every
network that has active members of the group and no network that does not.
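
The four steps above can be sketched in Python. This is a simplified illustration under stated assumptions: the topology, link costs, and node names are hypothetical, routers and networks are modeled as nodes of one graph, and group membership is passed in as a set rather than learned through IGMP and link-state updates.

```python
import heapq


def shortest_path_tree(graph, source):
    """Step 1: Dijkstra. Return {node: parent} for the tree rooted at source."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            new_dist = d + cost
            if neighbor not in dist or new_dist < dist[neighbor]:
                dist[neighbor] = new_dist
                parent[neighbor] = node
                heapq.heappush(heap, (new_dist, neighbor))
    return parent


def multicast_branches(graph, source, router, members):
    """Steps 2 to 4: take the subtree below router and prune memberless branches."""
    parent = shortest_path_tree(graph, source)
    children = {node: [] for node in parent}
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)

    def has_member(node):
        return node in members or any(has_member(c) for c in children[node])

    # Forward only toward children whose subtree contains a group member.
    return sorted(c for c in children[router] if has_member(c))


# Hypothetical topology: S is the source, R* are routers, N* are networks.
GRAPH = {
    "S": {"R1": 1},
    "R1": {"S": 1, "R2": 2, "R3": 3},
    "R2": {"R1": 2, "N1": 1},
    "R3": {"R1": 3, "N2": 1, "N3": 1},
    "N1": {"R2": 1},
    "N2": {"R3": 1},
    "N3": {"R3": 1},
}

# Only N2 has active members of group G.
print(multicast_branches(GRAPH, "S", "R1", {"N2"}))
print(multicast_branches(GRAPH, "S", "R3", {"N2"}))
```

With only N2 in the group, R1 forwards toward R3 but not R2, and R3 forwards onto N2 but not N3.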
The figure below shows an example of using the steps to change a graph to a multicast tree.

Example of tree formation in MOSPF
