Computer Networks (Bcs502)
MODULE-3
• The Network layer in the TCP/IP protocol suite is responsible for the host-to-
host delivery of datagrams.
• It provides services to the transport layer and receives services from the data-
link layer.
• In this chapter, we introduce the general concepts and issues in the network
layer.
Department of Computer Science & Engineering
Although the source and destination hosts are involved in all five layers of the
TCP/IP suite, the routers use only three layers if they are routing packets only;
however, they may need the transport and application layers for control purposes.
A router in the path is normally shown with two data-link layers and two
physical layers, because it receives a packet from one network and delivers it to
another network.
Packetizing
The first duty of the network layer is definitely packetizing: encapsulating the
payload (data received from upper layer) in a network-layer packet at the source
and decapsulating the payload from the network-layer packet at the destination.
The network layer acts as a carrier, like the post office, which is responsible
for delivering packages from a sender to a receiver without changing or using
the contents.
The source host
‣ receives the payload from an upper-layer protocol, adds a header that contains
the source and destination addresses and some other information that is
required by the network-layer protocol and delivers the packet to the data-link
layer.
‣ is not allowed to change the content of the payload unless it is too large for
delivery and needs to be fragmented.
Department of CSE- Data Science
Routing
The network layer is responsible for routing the packet from its source to the
destination.
A physical network is a combination of networks (LANs and WANs) and routers
that connect them. This means that there is more than one route from the source to
the destination.
The network layer is responsible for finding the best one among these possible
routes.
The network layer needs to have some specific strategies for defining the best
route.
This is done by running some routing protocols to help the routers coordinate their
knowledge about the neighborhood and to come up with consistent tables to be
used when a packet arrives.
Forwarding
Forwarding can be defined as the action applied by each router when a packet
arrives at one of its interfaces.
The decision-making table a router normally uses for applying this action is called
the forwarding table (or routing table).
When a router receives a packet from one of its attached networks, it needs to
forward the packet to another attached network (in unicast routing) or to some
attached networks (in multicast routing).
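To make this decision, the router uses a piece of information in the packet header,
which can be the destination address or a label, to find the corresponding output
interface number in the forwarding table.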
Other Services
Error Control
Although error control also can be implemented in the network layer, the
designers of the network layer in the Internet ignored this issue for the data being
carried by the network layer. One reason for this decision is the fact that the packet
in the network layer may be fragmented at each router, which makes error
checking at this layer inefficient.
The designers of the network layer, however, have added a checksum field to the
datagram to control any corruption in the header, but not in the whole packet. This
checksum can detect changes or corruption in the header of the packet; it cannot
prevent them.
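The header checksum mentioned above can be illustrated with a short sketch. It assumes the standard IPv4 rule (the 16-bit one's-complement sum of the header words, complemented); the function names and the sample header bytes are hypothetical and not taken from these notes.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Return the 16-bit one's-complement checksum over the header bytes.

    When computing the value to store, the checksum field itself must be zero.
    """
    if len(header) % 2:
        header += b"\x00"  # defensive padding; real IPv4 headers are multiples of 4 bytes
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def header_is_intact(header: bytes) -> bool:
    """A received header (checksum field included) is intact if the result is 0."""
    return ipv4_header_checksum(header) == 0


# Hypothetical 20-byte header with the checksum field (bytes 10-11) set to zero.
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = ipv4_header_checksum(sample)
filled = sample[:10] + checksum.to_bytes(2, "big") + sample[12:]
```

A router would run the second function on each arriving header and discard the datagram if it returns False. The payload is never covered by this check.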
Although the network layer in the Internet does not directly provide error control,
the Internet uses an auxiliary protocol, ICMP, that provides some kind of error
control if the packet is discarded or has some unknown information in the header.
Flow Control
The network layer in the Internet does not directly provide any flow control.
The packets are sent by the sender when they are ready, without any attention to
the readiness of the receiver.
Reasons for the lack of flow control in the design of the network layer
1. Since there is no error control in this layer, the job of the network layer at the
receiver is so simple that it may rarely be overwhelmed.
2. The upper layers that use the service of the network layer can implement
buffers to receive data from the network layer as they are ready and do not have
to consume the data as fast as it is received.
3. Flow control is provided for most of the upper-layer protocols that use the
services of the network layer, so another level of flow control makes the
network layer more complicated and the whole system less efficient.
Congestion Control
Congestion may occur if the number of packets sent by source computers is beyond the
capacity of the network or routers. In this situation, some routers may drop some of
the packets.
However, as more packets are dropped, the situation may become worse because, due to the
error control mechanism at the upper layers, the sender may send duplicates of the lost
packets.
If the congestion continues, sometimes a situation may reach a point where the system
collapses and no packets are delivered.
Quality of Service
As the Internet has allowed new applications such as multimedia communication (in
particular real-time communication of audio and video), the quality of service (QoS) of the
communication has become more and more important.
To keep the network layer untouched, these provisions are mostly implemented in the upper
layer.
Security
Security was not a concern when the Internet was originally designed because it
was used by a small number of users at universities for research activities; other
people had no access to the Internet.
Packet Switching
At the network layer, a message from the upper layer is divided into manageable
packets and each packet is sent through the network.
The source of the message sends the packets one by one; the destination of the
message receives the packets one by one.
The destination waits for all packets belonging to the same message to arrive
before delivering the message to the upper layer.
The connecting devices in a packet-switched network still need to decide how to
route the packets to the final destination.
Two different approaches to route the packets:
1. Datagram approach
2. Virtual circuit approach
The network layer is only responsible for delivery of packets from the source to the
destination.
The packets in a message may or may not travel the same path to their destination.
Each packet is routed based on the information contained in its header: source
and destination addresses.
The destination address defines where it should go; the source address defines
where it comes from.
The router in this case routes the packet based only on the destination address.
The source address may be used to send an error message to the source if the
packet is discarded.
1. Setup Phase
A router creates an entry for a virtual circuit.
For example, suppose source A needs to create a virtual circuit to destination B.
Two auxiliary packets need to be exchanged between the sender and the receiver: the
request packet and the acknowledgment packet.
Figure : Forwarding process in a router when
used in a virtual-circuit network
Request packet
A request packet is sent from the source to the destination.
This auxiliary packet carries the source and destination addresses.
Acknowledgment Packet
A special packet, called the acknowledgment packet, completes the entries in the
switching tables.
3. Router R3 receives the setup request packet. The same events happen here as at router R1;
three columns of the table are completed: in this case, incoming port (1), incoming label
(66), and outgoing port (3).
4. Router R4 receives the setup request packet. Again, three columns are completed:
incoming port (1), incoming label (22), and outgoing port (4).
5. Destination B receives the setup packet, and if it is ready to receive packets from A, it
assigns a label to the incoming packets that come from A, in this case 77. This label lets the
destination know that the packets come from A, and not from other sources.
3. Router R3 sends an acknowledgment to router R1 that contains its incoming label in the
table, chosen in the setup phase. Router R1 uses this as the outgoing label in the table.
4. Finally router R1 sends an acknowledgment to source A that contains its incoming label in
the table, chosen in the setup phase.
5. The source uses this as the outgoing label for the data packets to be sent to destination B.
2. Data-Transfer Phase
After all routers have created their forwarding table for a specific virtual circuit,
the network-layer packets belonging to one message can be sent one after
another.
The source computer uses the label 14, which it has received from router R1 in the
setup phase.
Router R1 forwards the packet to router R3, but changes the label to 66.
Router R3 forwards the packet to router R4, but changes the label to 22.
Finally, router R4 delivers the packet to its final destination with the label 77.
All the packets in the message follow the same sequence of labels, and the packets
arrive in order at the destination.
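The label changes described above (14, 66, 22, 77) can be traced with a minimal sketch of per-router switching tables. The ports for R3 and R4 follow the setup steps in these notes; the ports for R1 and the names `SWITCHING_TABLES`, `PATH`, and `send_packet` are assumptions made for illustration.

```python
# Each router maps (incoming port, incoming label) -> (outgoing port, outgoing label).
SWITCHING_TABLES = {
    "R1": {(1, 14): (3, 66)},  # R1 port numbers are assumed, not given in the notes
    "R3": {(1, 66): (3, 22)},
    "R4": {(1, 22): (4, 77)},
}
PATH = ["R1", "R3", "R4"]  # routers visited by the virtual circuit from A to B


def send_packet(label: int, in_port: int = 1):
    """Follow one data packet along the circuit and record every label swap."""
    trace = []
    for router in PATH:
        out_port, out_label = SWITCHING_TABLES[router][(in_port, label)]
        trace.append((router, label, out_label, out_port))
        label = out_label
        in_port = 1  # assumption: the packet arrives on port 1 of the next router
    return label, trace


final_label, hops = send_packet(14)
```

Because every packet of the message starts with label 14 and each table has exactly one entry for it, all packets take the same path, which is why they arrive in order.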
Figure : Flow of one packet in an established virtual circuit
3. Teardown Phase
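In the teardown phase, source A, after sending all packets to B, sends a special
packet called a teardown packet. Destination B responds with a confirmation packet,
and all routers delete the corresponding entries from their tables.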
IPv4 Addresses
The identifier used in the IP layer of the TCP/IP protocol suite to identify the
connection of each device to the Internet is called the Internet address or IP address.
An IPv4 address is a 32-bit address that uniquely and universally defines the
connection of a host or a router to the Internet.
The IP address is the address of the connection, not the host or the router, because if
the device is moved to another network, the IP address may be changed.
IPv4 addresses are unique in the sense that each address defines one, and only one,
connection to the Internet.
If a device has two connections to the Internet, via two networks, it has two IPv4
addresses.
IPv4 addresses are universal in the sense that the addressing system must be accepted
by any host that wants to be connected to the Internet.
Address Space
An address space is the total number of addresses used by the protocol.
If a protocol uses b bits to define an address, the address space is 2^b because each
bit can have two different values (0 or 1).
IPv4 uses 32-bit addresses, which means that the address space is 2^32 or
4,294,967,296 (more than four billion).
If there were no restrictions, more than 4 billion devices could be connected to the
Internet.
Notation
Hierarchy in Addressing
The prefix length is n bits and the suffix length is (32 − n) bits.
The network identifier in the IPv4 was first designed as a fixed-length prefix. This
scheme, which is now obsolete, is referred to as classful addressing.
Classful Addressing
When the Internet started, an IPv4 address was designed with a fixed-length
prefix, but to accommodate both small and large networks, three fixed-length
prefixes were designed instead of one (n = 8, n = 16, and n = 24).
The whole address space was divided into five classes (class A, B, C, D, and E)
In class A, the network length is 8 bits, but since the first bit, which is 0, defines
the class, we can have only seven bits as the network identifier. This means there
are only 2^7 = 128 networks in the world that can have a class A address.
In class B, the network length is 16 bits, but since the first two bits, which are
(10)_2, define the class, we can have only 14 bits as the network identifier. This
means there are only 2^14 = 16,384 networks in the world that can have a class B
address.
All addresses that start with (110)_2 belong to class C. In class C, the network
length is 24 bits, but since three bits define the class, we can have only 21 bits as
the network identifier. This means there are 2^21 = 2,097,152 networks in the world
that can have a class C address.
Class D is not divided into prefix and suffix. It is used for multicast addresses.
All addresses that start with 1111 in binary belong to class E. As in Class D,
Class E is not divided into prefix and suffix and is used as reserve.
Address Depletion
The reason that classful addressing has become obsolete is address depletion.
Since the addresses were not distributed properly, the Internet was faced with
the problem of the addresses being rapidly used up, resulting in no more
addresses available for organizations and individuals that needed to be
connected to the Internet.
Let us think about class A. This class can be assigned to only 128 organizations
in the world, but each organization needs to have a single network (seen by the
rest of the world) with 16,777,216 nodes (computers in this single network).
Since there may be only a few organizations that are this large, most of the
addresses in this class were wasted (unused).
Class B addresses were designed for midsize organizations, but many of the
addresses in this class also remained unused.
Class C addresses have a completely different flaw in design. The number of
addresses that can be used in each network (256) was so small that most
companies were not comfortable using a block in this address class.
Class E addresses were almost never used, wasting the whole class.
Example
Find the class of each address.
a. 00000001 00001011 00001011 11101111
b. 11000001 10000011 00011011 11111111
c. 14.23.120.8
d. 252.5.15.111
Solution
a. The first bit is 0. This is a class A address.
b. The first 2 bits are 1; the third bit is 0. This is a class C address.
c. The first byte is 14; the class is A.
d. The first byte is 252; the class is E.
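The rule used in this solution (inspect the leading bits of the first byte) can be written as a small sketch. The function name `ipv4_class` is hypothetical; the sketch accepts either dotted-decimal or space-separated binary notation, as in the example.

```python
def ipv4_class(address: str) -> str:
    """Return the classful-addressing class (A to E) of an IPv4 address."""
    if "." in address:
        first_byte = int(address.split(".")[0])   # dotted-decimal notation
    else:
        first_byte = int(address.split()[0], 2)   # binary notation, bytes separated by spaces
    if first_byte < 128:   # leading bit 0
        return "A"
    if first_byte < 192:   # leading bits 10
        return "B"
    if first_byte < 224:   # leading bits 110
        return "C"
    if first_byte < 240:   # leading bits 1110
        return "D"
    return "E"             # leading bits 1111


answers = [ipv4_class(a) for a in (
    "00000001 00001011 00001011 11101111",
    "11000001 10000011 00011011 11111111",
    "14.23.120.8",
    "252.5.15.111",
)]
```

Comparing the first byte against 128, 192, 224, and 240 is equivalent to testing the leading bits 0, 10, 110, and 1110.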
Classless Addressing
In classless addressing, the whole address space is divided into variable length
blocks.
The prefix in an address defines the block (network); the suffix defines the node
(device).
Theoretically, we can have a block of 2^0, 2^1, 2^2, . . . , 2^32 addresses.
One of the restrictions is that the number of addresses in a block needs to be a
power of 2.
An organization can be granted one block of addresses.
The prefix length in classless addressing is variable. We can have a prefix length
that ranges from 0 to 32.
The size of the network is inversely proportional to the length of the prefix. A
small prefix means a larger network; a large prefix means a smaller network.
Example 1
2. The first address can be found by keeping the first 27 bits and changing the rest
of the bits to 0s.
Address Mask
Another way to find the first and last addresses in the block is to use the address
mask.
The address mask is a 32-bit number in which the n leftmost bits are set to 1s and
the rest of the bits (32 − n) are set to 0s.
A computer can easily find the address mask because it is the complement of
(2^(32 − n) − 1).
The reason for defining a mask in this way is that it can be used by a computer
program to extract the information in a block, using the three bit-wise operations
NOT, AND, and OR.
2. The first address in the block = (Any address in the block) AND (mask).
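A minimal sketch of the mask-based extraction follows. The AND rule for the first address is the one stated above; the rules for the block size (NOT mask, plus 1) and the last address (address OR NOT mask) are the usual companions and are assumed here. The function name and the sample address 167.199.170.82/27 are hypothetical.

```python
import ipaddress


def block_info(address: str, n: int):
    """Return (first address, last address, number of addresses) for address/n."""
    addr = int(ipaddress.IPv4Address(address))
    mask = (0xFFFFFFFF << (32 - n)) & 0xFFFFFFFF  # n leftmost bits set to 1
    not_mask = ~mask & 0xFFFFFFFF                 # equals 2^(32 - n) - 1
    first = addr & mask        # first address = address AND mask
    last = addr | not_mask     # last address  = address OR (NOT mask)
    count = not_mask + 1       # block size    = (NOT mask) + 1
    return (str(ipaddress.IPv4Address(first)),
            str(ipaddress.IPv4Address(last)),
            count)


first, last, count = block_info("167.199.170.82", 27)
```

With a 27-bit prefix the last five bits are cleared for the first address and set for the last address, so the sample block runs from 167.199.170.64 to 167.199.170.95 and holds 32 addresses.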
Example 2
Network Address
Given any address, we can find all information about the block.
The first address, the network address, is particularly important because it is used
in routing a packet to its destination network.
For the moment, let us assume that an internet is made of m networks and a router
with m interfaces.
When a packet arrives at the router from any source host, the router needs to know
to which network the packet should be sent: from which interface the packet should
be sent out.
After the network address has been found, the router consults its forwarding table
to find the corresponding interface from which the packet should be sent out.
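The lookup described above can be sketched as follows: mask the destination address with each entry's mask, compare the result with that entry's network address, and send the packet out of the matching interface. The table contents, interface names, and the choice of the longest matching prefix when several entries match are assumptions for illustration.

```python
import ipaddress

# Hypothetical forwarding table: network address/prefix -> outgoing interface.
FORWARDING_TABLE = {
    ipaddress.ip_network("10.1.0.0/16"): "if0",
    ipaddress.ip_network("10.2.0.0/16"): "if1",
    ipaddress.ip_network("192.168.5.0/24"): "if2",
}


def outgoing_interface(destination: str):
    """Return the interface for a destination address, or None if no entry matches."""
    addr = int(ipaddress.IPv4Address(destination))
    best = None
    for network, interface in FORWARDING_TABLE.items():
        # Extract the network address of the destination under this entry's mask.
        if addr & int(network.netmask) == int(network.network_address):
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, interface)
    return best[1] if best else None
```

For example, `outgoing_interface("10.2.7.9")` selects the entry for 10.2.0.0/16, while an address that matches no entry yields None (a real router would typically fall back to a default route).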
Block Allocation
The next issue in classless addressing is block allocation. How are the blocks
allocated?
The ultimate responsibility of block allocation is given to a global authority called
the Internet Corporation for Assigned Names and Numbers (ICANN).
ICANN does not normally allocate addresses to individual Internet users. It
assigns a large block of addresses to an ISP.
For the proper operation of the CIDR, two restrictions need to be applied to the
allocated block.
1. The number of requested addresses, N, needs to be a power of 2. The reason is
that N = 2^(32 − n) or n = 32 − log2 N. If N is not a power of 2, we cannot have an
integer value for n.
Example 3
An ISP has requested a block of 1000 addresses.
Since 1000 is not a power of 2, 1024 addresses are granted. The prefix length is
calculated as n = 32 − log2 1024 = 22.
An available block, 18.14.12.0/22, is granted to the ISP.
It can be seen that the first address in decimal is 302,910,464, which is divisible
by 1024.
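The arithmetic in this example can be checked with a few lines. The helper names are hypothetical; the numbers are the ones from the example.

```python
import ipaddress


def granted_block_size(requested: int) -> int:
    """Round the requested number of addresses up to the next power of 2."""
    size = 1
    while size < requested:
        size *= 2
    return size


def prefix_length(block_size: int) -> int:
    """n = 32 - log2(N), valid when N is a power of 2."""
    return 32 - (block_size.bit_length() - 1)


size = granted_block_size(1000)                      # 1024
n = prefix_length(size)                              # 22
first = int(ipaddress.IPv4Address("18.14.12.0"))     # first address as an integer
aligned = first % size == 0                          # block must start on a multiple of N
```

The divisibility check in the last line is the practical test that the first address of a block is aligned with the block size.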
Subnetting
More levels of hierarchy can be created using subnetting.
An organization (or an ISP) that is granted a range of addresses may divide the range
into several subranges and assign each subrange to a subnetwork (or subnet).
A subnetwork can be divided into several sub-subnetworks. A sub-subnetwork can be
divided into several sub-sub-subnetworks, and so on.
Designing Subnets
• The subnetworks in a network should be carefully designed to enable the routing
of packets.
• We assume the total number of addresses granted to the organization is N, the
prefix length is n, the assigned number of addresses to each subnetwork is Nsub,
and the prefix length for each subnetwork is nsub.
Example
If we add all addresses in the previous subblocks, the result is 208 addresses,
which means 48 addresses are left in reserve.
The first address in this range is 14.24.74.208. The last address is 14.24.74.255.
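The totals above can be reproduced with a short allocation sketch. It assumes the organization holds 14.24.74.0/24 and needs subnets of 120, 60, and 10 addresses (rounded up to 128, 64, and 16); these inputs are assumptions chosen to match the 208 used and 48 reserved addresses stated here. Subblocks are assigned largest first so that each one starts on a multiple of its own size.

```python
import ipaddress


def design_subnets(block: str, requests):
    """Allocate power-of-2 subblocks, largest first, from the start of the block."""
    network = ipaddress.ip_network(block)
    next_addr = int(network.network_address)
    end = int(network.broadcast_address) + 1
    plan = []
    for need in sorted(requests, reverse=True):
        size = 1
        while size < need:          # round up to a power of 2 (Nsub)
            size *= 2
        if next_addr + size > end:
            raise ValueError("block too small for the requested subnets")
        nsub = 32 - (size.bit_length() - 1)   # nsub = 32 - log2(Nsub)
        plan.append(ipaddress.ip_network((next_addr, nsub)))
        next_addr += size
    reserve_start = str(ipaddress.IPv4Address(next_addr)) if next_addr < end else None
    return plan, reserve_start, end - next_addr


subnets, reserve_start, reserve_count = design_subnets("14.24.74.0/24", [120, 60, 10])
```

Under these assumptions the three subblocks are 14.24.74.0/25, 14.24.74.128/26, and 14.24.74.192/28, and the reserve begins at 14.24.74.208.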
Address Aggregation
One of the advantages of the CIDR strategy is address aggregation (sometimes
called address summarization or route summarization).
When blocks of addresses are combined to create a larger block, routing can be
done based on the prefix of the larger block.
ICANN assigns a large block of addresses to an ISP. Each ISP in turn divides its
assigned block into smaller subblocks and grants the subblocks to its customers.
The figure shows how four small blocks of addresses are assigned to four
organizations by an ISP.
The ISP combines these four blocks into one single block and advertises the larger
block to the rest of the world.
Any packet destined for this larger block should be sent to this ISP. It is the
responsibility of the ISP to forward the packet to the appropriate organization.
This is similar to routing we can find in a postal network. All packages coming
from outside a country are sent first to the capital and then distributed to the
corresponding destination.
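Address aggregation can be tried with Python's standard `ipaddress` module. The four /26 customer blocks below are hypothetical; they were chosen to be contiguous so that they collapse into a single larger block.

```python
import ipaddress

# Hypothetical blocks granted by the ISP to four organizations.
customer_blocks = [ipaddress.ip_network(b) for b in (
    "160.70.14.0/26",
    "160.70.14.64/26",
    "160.70.14.128/26",
    "160.70.14.192/26",
)]

# The ISP advertises the smallest set of blocks that covers all customers.
advertised = list(ipaddress.collapse_addresses(customer_blocks))


def customer_for(destination: str):
    """Inside the ISP, pick the customer block that contains the destination."""
    addr = ipaddress.IPv4Address(destination)
    for block in customer_blocks:
        if addr in block:
            return block
    return None
```

The rest of the world needs one forwarding entry (the collapsed block), while only the ISP keeps the four detailed entries.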
Special Addresses
Five special addresses are used for special purposes:
1. This-host address
2. Limited-broadcast address
3. Loopback address
4. Private addresses
5. Multicast addresses
1. This-host Address
The only address in the block 0.0.0.0/32 is called the this-host address.
It is used whenever a host needs to send an IP datagram but it does not know its
own address to use as the source address.
2. Limited-broadcast Address
The only address in the block 255.255.255.255/32 is called the limited-broadcast
address.
It is used whenever a router or a host needs to send a datagram to all devices in a
network.
The routers in the network, however, block the packet having this address as the
destination; the packet cannot travel outside the network.
3. Loopback Address
The block 127.0.0.0/8 is called the loopback address.
A packet with one of the addresses in this block as the destination address never
leaves the host; it will remain in the host.
Any address in the block is used to test a piece of software in the machine. For
example, we can write a client and a server program in which one of the
addresses in the block is used as the server address.
We can test the programs using the same host to see if they work before running
them on different computers.
4. Private Addresses
Four blocks are assigned as private addresses: 10.0.0.0/8, 172.16.0.0/12,
192.168.0.0/16, and 169.254.0.0/16.
5. Multicast Addresses
The block 224.0.0.0/4 is reserved for multicast addresses.
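The five categories can be checked programmatically with a membership test against the blocks listed in this section. This is a sketch; the names `SPECIAL_BLOCKS` and `special_kind` are hypothetical.

```python
import ipaddress

# Blocks exactly as listed in this section.
SPECIAL_BLOCKS = [
    ("this-host", "0.0.0.0/32"),
    ("limited-broadcast", "255.255.255.255/32"),
    ("loopback", "127.0.0.0/8"),
    ("private", "10.0.0.0/8"),
    ("private", "172.16.0.0/12"),
    ("private", "192.168.0.0/16"),
    ("private", "169.254.0.0/16"),
    ("multicast", "224.0.0.0/4"),
]


def special_kind(address: str):
    """Return the special category of an address, or None for an ordinary address."""
    addr = ipaddress.IPv4Address(address)
    for kind, block in SPECIAL_BLOCKS:
        if addr in ipaddress.ip_network(block):
            return kind
    return None
```

Note that 172.16.0.0/12 ends at 172.31.255.255, so an address such as 172.32.0.1 is not private.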
2. Prefix
3. Address of a router
The 64-byte option field has a dual purpose. It can carry either additional information or
some specific vendor information.
The server uses a number, called a magic cookie, in the format of an IP address with the
value of 99.130.83.99.
When the client finishes reading the message, it looks for this magic cookie.
If present, the next 60 bytes are options.
An option is composed of three fields: a 1-byte tag field, a 1-byte length field, and
a variable-length value field.
There are several tag fields that are mostly used by vendors. If the tag field is 53,
the value field defines one of the 8 message types.
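The option layout described above (magic cookie, then tag, length, value triples) can be parsed with a short sketch. Two details are assumptions taken from the DHCP options specification rather than from these notes: tag 0 is a one-byte pad and tag 255 marks the end of the options. The message-type names are likewise the specification's names, and the function names are hypothetical.

```python
MAGIC_COOKIE = bytes([99, 130, 83, 99])  # 99.130.83.99

# Names of the 8 message types carried by tag 53 (assumed from the DHCP options spec).
MESSAGE_TYPES = {
    1: "DHCPDISCOVER", 2: "DHCPOFFER", 3: "DHCPREQUEST", 4: "DHCPDECLINE",
    5: "DHCPACK", 6: "DHCPNAK", 7: "DHCPRELEASE", 8: "DHCPINFORM",
}


def parse_options(field: bytes) -> dict:
    """Return {tag: value bytes} if the field starts with the magic cookie, else {}."""
    if field[:4] != MAGIC_COOKIE:
        return {}
    options = {}
    i = 4
    while i < len(field):
        tag = field[i]
        if tag == 0:        # pad: a single byte with no length or value
            i += 1
            continue
        if tag == 255:      # end of options
            break
        if i + 1 >= len(field):
            break           # truncated option: stop rather than guess
        length = field[i + 1]
        options[tag] = field[i + 2:i + 2 + length]
        i += 2 + length
    return options


def message_type(field: bytes):
    """Return the message-type name from option 53, or None if it is absent."""
    value = parse_options(field).get(53)
    return MESSAGE_TYPES.get(value[0]) if value else None
```

For instance, the bytes `99 130 83 99 53 1 1 255` decode to a single option with tag 53, length 1, and value 1.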
The joining host creates a DHCPDISCOVER message in which only the transaction-ID
field is set to a random number. No other field can be set because the host has no knowledge
with which to do so.
This message is encapsulated in a UDP user datagram with the source port set to 68 and the
destination port set to 67.
The user datagram is encapsulated in an IP datagram with the source address set to 0.0.0.0
(“this host”) and the destination address set to 255.255.255.255 (broadcast address).
The reason is that the joining host knows neither its own address nor the server address.
DHCP Operation
Finally, the selected server responds with a DHCPACK message to the client if the
offered IP address is valid.
If the server cannot keep its offer (for example, if the address is offered to another
host in between), the server sends a DHCPNAK message and the client needs
to repeat the process.
This message is also broadcast to let other servers know that the request is accepted or
rejected.
Error Control
DHCP uses the service of UDP, which is not reliable. To provide error control,
DHCP uses two strategies.
First, DHCP requires that UDP use the checksum.
Second, the DHCP client uses timers and a retransmission policy if it does not
receive the DHCP reply to a request.
However, to prevent a traffic jam when several hosts need to retransmit a
request (for example, after a power failure), DHCP forces the client to use a
random number to set its timers.
Transition States
To provide dynamic address allocation, the DHCP client acts as a state machine
that performs transitions from one state to another depending on the messages it
receives or sends.
When the DHCP client first starts, it is in the INIT state (initializing state).
The client broadcasts a discover message. When it receives an offer, the client
goes to the SELECTING state. While it is there, it may receive more offers.
After it selects an offer, it sends a request message and goes to the REQUESTING
state.
If an ACK arrives while the client is in this state, it goes to the BOUND state and
uses the IP address.
When the lease is 50 percent expired, the client tries to renew it by moving to the
RENEWING state.
If the server renews the lease, the client moves to the BOUND state again.
If the lease is not renewed and the lease time is 75 percent expired, the client
moves to the REBINDING state.
If the server agrees with the lease (ACK message arrives), the client moves to
the BOUND state and continues using the IP address; otherwise, the client
moves to the INIT state and requests another IP address.
Note that the client can use the IP address only when it is in the BOUND,
RENEWING, or REBINDING state.
The above procedure requires that the client use three timers: renewal timer
(set to 50 percent of the lease time), rebinding timer (set to 75 percent of the
lease time), and expiration timer (set to the lease time).
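The transitions and timers described above can be captured in a small table-driven sketch. The state names come from these notes; the event names (such as `offer_received` or `renewal_timer`) are hypothetical labels, and "otherwise" in the REBINDING state is modeled as either a negative reply or lease expiration.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("INIT", "offer_received"): "SELECTING",
    ("SELECTING", "request_sent"): "REQUESTING",
    ("REQUESTING", "ack"): "BOUND",
    ("BOUND", "renewal_timer"): "RENEWING",        # lease 50 percent expired
    ("RENEWING", "ack"): "BOUND",
    ("RENEWING", "rebinding_timer"): "REBINDING",  # lease 75 percent expired
    ("REBINDING", "ack"): "BOUND",
    ("REBINDING", "nak"): "INIT",
    ("REBINDING", "lease_expired"): "INIT",
}

USABLE_STATES = {"BOUND", "RENEWING", "REBINDING"}


def run(events, state="INIT"):
    """Apply a sequence of events; events with no transition leave the state unchanged."""
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state


def can_use_address(state: str) -> bool:
    return state in USABLE_STATES


def lease_timers(lease_seconds: float) -> dict:
    """Timer values as fractions of the lease time, as stated in the notes."""
    return {
        "renewal": 0.50 * lease_seconds,
        "rebinding": 0.75 * lease_seconds,
        "expiration": lease_seconds,
    }
```

For a one-hour lease, `lease_timers(3600)` gives a renewal timer of 1800 s, a rebinding timer of 2700 s, and an expiration timer of 3600 s.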
Network address translation (NAT) is a way to map multiple private addresses inside
a local network to a public IP address before transferring the information onto the
Internet.
Organizations that want multiple devices to employ a single IP address use NAT,
as do most home routers.
The private network uses private addresses. The router that connects the network
to the global Internet uses one private address and one global address.
The private network is invisible to the rest of the Internet; the rest of the Internet
sees only the NAT router with the address 200.24.5.8.
Address Translation
All of the outgoing packets go through the NAT router, which replaces the source
address in the packet with the global NAT address.
All incoming packets also pass through the NAT router, which replaces the
destination address in the packet (the NAT router global address) with the
appropriate private address.
Translation Table
How does the NAT router know the destination address for a packet coming from
the Internet?
There may be tens or hundreds of private IP addresses, each belonging to one
specific host. The problem is solved if the NAT router has a translation table.
Using One IP Address
In its simplest form, a translation table has only two columns: the private address
and the external address (destination address of the packet).
When the router translates the source address of the outgoing packet, it also makes
note of the destination address— where the packet is going.
When the response comes back from the destination, the router uses the source
address of the packet (as the external address) to find the private address of the
packet.
Figure : Translation
In this strategy, communication must always be initiated by the private network,
because the NAT router creates a translation-table entry only when an outgoing
packet passes through it.
The use of only one global address by the NAT router allows only one private-
network host to access a given external host.
To remove this restriction, the NAT router can use a pool of global addresses.
For example, instead of using only one global address (200.24.5.8), the NAT router
can use four addresses (200.24.5.8, 200.24.5.9, 200.24.5.10, and 200.24.5.11).
In this case, four private-network hosts can communicate with the same external
host at the same time because each pair of addresses defines a separate connection.
Drawbacks.
‣ No private-network host can access two external server programs (e.g., HTTP and
TELNET) at the same time.
‣ Two private-network hosts cannot access the same external server program (e.g.,
HTTP or TELNET) at the same time.
When the response from HTTP comes back, the combination of source address
(25.8.3.2) and destination port address (1401) defines the private network host to which
the response should be directed.
Note also that for this translation to work, the ephemeral port addresses (1400 and 1401) must
be unique.
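A minimal sketch of port-based translation follows. As described above, the pair (external address, ephemeral port) identifies the private host. The global address 200.24.5.8, the external host 25.8.3.2, and the ports 1400 and 1401 come from these notes; the private addresses 172.18.3.1 and 172.18.3.2, the class name, and the tuple layout are assumptions.

```python
class NatRouter:
    """Toy NAT router; packets are (src addr, src port, dst addr, dst port) tuples."""

    def __init__(self, global_address: str):
        self.global_address = global_address
        # (external address, ephemeral port) -> private address
        self.table = {}

    def translate_outgoing(self, private_addr, private_port, external_addr, external_port):
        key = (external_addr, private_port)
        owner = self.table.setdefault(key, private_addr)
        if owner != private_addr:
            # Two private hosts picked the same ephemeral port toward the same
            # external host, so replies could not be told apart.
            raise ValueError("ephemeral port must be unique")
        return (self.global_address, private_port, external_addr, external_port)

    def translate_incoming(self, external_addr, external_port, dest_port):
        private_addr = self.table.get((external_addr, dest_port))
        if private_addr is None:
            return None  # no entry: the exchange was not started from the private side
        return (external_addr, external_port, private_addr, dest_port)


nat = NatRouter("200.24.5.8")
out1 = nat.translate_outgoing("172.18.3.1", 1400, "25.8.3.2", 80)
out2 = nat.translate_outgoing("172.18.3.2", 1401, "25.8.3.2", 80)
reply = nat.translate_incoming("25.8.3.2", 80, 1401)
```

Both outgoing packets leave with the same global source address, and the reply addressed to port 1401 is delivered to the second private host because its table entry is distinct.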
IPv6 Datagram
The main reason for migration from IPv4 to IPv6 is the small size of the address
space in IPv4.
The change in address size for IPv6 requires a change in the IPv4 packet format.
The following shows other changes implemented in the protocol in addition to
changing address size and format.
‣ Better header format. IPv6 uses a new header format in which options are
separated from the base header and inserted, when needed, between the base header
and the data. This simplifies and speeds up the routing process because most of the
options do not need to be checked by routers.
‣ New options. IPv6 has new options to allow for additional functionalities.
‣ Allowance for extension. IPv6 is designed to allow the extension of the protocol if
required by new technologies or applications.
‣ Support for resource allocation. In IPv6, the type-of-service field has been
removed, but two new fields, traffic class and flow label, have been added to
enable the source to request special handling of the packet. This mechanism can
be used to support traffic such as real-time audio and video.
‣ Support for more security. The encryption and authentication options in IPv6
provide confidentiality and integrity of the packet.
Packet Format
Each packet is composed of a base header followed by the payload.
The base header occupies 40 bytes, whereas payload can be up to 65,535 bytes
of information.
Traffic class. The 8-bit traffic class field is used to distinguish different payloads
with different delivery requirements. It replaces the type-of-service field in IPv4.
Flow label. The flow label is a 20-bit field that is designed to provide special
handling for a particular flow of data. We will discuss this field later.
Payload length. The 2-byte payload length field defines the length of the IP
datagram excluding the header. Note that IPv4 defines two fields related to the
length: header length and total length. In IPv6, the length of the base header is
fixed (40 bytes); only the length of the payload needs to be defined.
Next header. The next header is an 8-bit field defining the type of the first
extension header (if present) or the type of the data that follows the base header in
the datagram.
Hop limit. The 8-bit hop limit field serves the same purpose as the TTL field in
IPv4.
Source and destination addresses. The source address field is a 16-byte (128-bit)
Internet address that identifies the original source of the datagram. The destination
address field is a 16-byte (128-bit) Internet address that identifies the destination
of the datagram.
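The fields listed above fit exactly into the 40-byte base header, which the following sketch packs and unpacks with the standard `struct` module. The 4-bit version field (value 6) at the start of the header is not covered in the field list above and is an assumption taken from the IPv6 header layout; the function names and sample addresses are hypothetical.

```python
import ipaddress
import struct

# 4 bytes: version (4 bits) + traffic class (8 bits) + flow label (20 bits)
# 2 bytes: payload length, 1 byte: next header, 1 byte: hop limit
# 16 bytes: source address, 16 bytes: destination address
BASE_HEADER = struct.Struct("!IHBB16s16s")


def pack_base_header(traffic_class, flow_label, payload_length,
                     next_header, hop_limit, source, destination) -> bytes:
    first_word = (6 << 28) | ((traffic_class & 0xFF) << 20) | (flow_label & 0xFFFFF)
    return BASE_HEADER.pack(
        first_word, payload_length, next_header, hop_limit,
        ipaddress.IPv6Address(source).packed,
        ipaddress.IPv6Address(destination).packed,
    )


def unpack_base_header(header: bytes) -> dict:
    first_word, payload_length, next_header, hop_limit, src, dst = \
        BASE_HEADER.unpack(header[:40])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": str(ipaddress.IPv6Address(src)),
        "destination": str(ipaddress.IPv6Address(dst)),
    }


header = pack_base_header(0, 0x12345, 20, 17, 64, "2001:db8::1", "2001:db8::2")
fields = unpack_base_header(header)
```

Because the base header has a fixed size, no header-length field is needed; only the payload length is carried.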
In IPv6, options, which are part of the header in IPv4, are designed as extension
headers.
The payload can have as many extension headers as required by the situation.
Each extension header has two mandatory fields, next header and the length,
followed by information related to the particular option.
Each next header field value (code) defines the type of the next header (hop-by-
hop option, source routing option, . . .); the last next header field defines the
protocol (UDP, TCP, . . .) that is carried by the datagram.
IPv6 datagrams can be fragmented only by the source, not by the routers; the
reassembly takes place at the destination.
Fragmenting at a router is costly: the packet needs to be fragmented, and all fields
related to the fragmentation need to be recalculated.
In IPv6, the source can check the size of the packet and make the decision to
fragment the packet or not.
When a router receives the packet, it can check the size of the packet and drop it
if the size is larger than allowed by the MTU of the network ahead.
The router then sends a packet-too-big ICMPv6 error message to inform the
source.
Extension Header
An IPv6 packet is made of a base header and some extension headers.
The length of the base header is fixed at 40 bytes. However, to give more
functionality to the IP datagram, the base header can be followed by up to six
extension headers.
Many of these headers are options in IPv4. Six types of extension headers have
been defined.
Hop-by-Hop Option
The hop-by-hop option is used when the source needs to pass information to all routers
visited by the datagram.
For example, perhaps routers must be informed about certain management, debugging, or
control functions. Or, if the length of the datagram is more than the usual 65,535 bytes,
routers must have this information.
So far, only three hop-by-hop options have been defined: Pad1, PadN, and jumbo payload.
‣ Pad1. This option is 1 byte long and is designed for alignment purposes. Some options
need to start at a specific bit of the 32-bit word. If an option falls short of this requirement
by exactly one byte, Pad1 is added.
‣ PadN. PadN is similar in concept to Pad1. The difference is that PadN is used when 2 or
more bytes are needed for alignment.
‣ Jumbo payload. Recall that the length of the payload in the IP datagram can be a
maximum of 65,535 bytes. However, if for any reason a longer payload is required, we
can use the jumbo payload option to define this longer length.
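The two padding options can be sketched as a small helper. This assumes the standard encoding, in which Pad1 is a single zero byte with no length field and PadN is option type 1 followed by a length byte and that many zero bytes. The function name make_padding is hypothetical.

```python
def make_padding(n):
    """Return n bytes of option padding, using Pad1 or PadN as appropriate."""
    if n < 0 or n > 257:
        raise ValueError("padding must be between 0 and 257 bytes")
    if n == 0:
        return b""
    if n == 1:
        return b"\x00"                       # Pad1: one zero byte, no length field
    return bytes([1, n - 2]) + bytes(n - 2)  # PadN: type 1, data length, zero data


print(make_padding(1).hex(), make_padding(4).hex())
```

Requests of 2 bytes or more always produce a PadN option, which agrees with the rule given above.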
Destination Option
The destination option is used when the source needs to pass information to the
destination only.
Intermediate routers are not permitted access to this information.
The format of the destination option is the same as the hop-by-hop option. So
far, only the Pad1 and PadN options have been defined.
Source Routing
The source routing extension header combines the concepts of the strict source
route and the loose source route options of IPv4.
Fragmentation
In IPv4, the source or a router is required to fragment if the size of the datagram is
larger than the MTU of the network over which the datagram travels.
In IPv6, only the original source can fragment. A source must use a Path MTU
Discovery technique to find the smallest MTU supported by any network on the
path.
The source then fragments using this knowledge. If the source does not use a Path
MTU Discovery technique, it fragments the datagram to a size of 1280 bytes or
smaller.
This is the minimum size of MTU required for each network connected to the
Internet.
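The interaction between the source and the routers can be simulated in a few lines of Python. This is a rough sketch, not a protocol implementation: the function names are hypothetical, each loop iteration stands for one dropped packet and one packet-too-big message, and the 48-byte overhead assumes a 40-byte base header plus an 8-byte fragment extension header.

```python
MIN_IPV6_MTU = 1280  # minimum MTU required of every network, as stated above


def discover_path_mtu(first_hop_mtu, link_mtus):
    """Return (path_mtu, packet_too_big_count) for a path with the given link MTUs."""
    size = first_hop_mtu
    errors = 0
    while True:
        # The first router whose outgoing link is too small drops the packet.
        too_small = next((mtu for mtu in link_mtus if mtu < size), None)
        if too_small is None:
            return size, errors
        size = too_small      # the source retries with the reported MTU
        errors += 1


def fragment_sizes(payload_len, mtu, overhead=48):
    """Split a payload into fragment data sizes that fit the given MTU."""
    per_fragment = (mtu - overhead) // 8 * 8   # every fragment but the last carries a multiple of 8 bytes
    sizes = []
    remaining = payload_len
    while remaining > 0:
        take = min(per_fragment, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


path_mtu, errors = discover_path_mtu(1500, [1500, 1400, 1300, 1500])
print(path_mtu, errors)
print(fragment_sizes(3000, MIN_IPV6_MTU))
```

A source that skips discovery would call fragment_sizes with MIN_IPV6_MTU directly, as in the last line.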
Authentication
The authentication extension header has a dual purpose: it validates the message
sender and ensures the integrity of the data.
The former is needed so the receiver can be sure that a message is from the
genuine sender and not from an imposter.
The latter is needed to check that the data is not altered in transit by some
hacker.
In an internet, the goal of the network layer is to deliver a datagram from its
source to its destination or destinations.
‣ If a datagram is destined for only one destination (one-to-one delivery), we have
unicast routing.
Unicast Routing
General Idea
In unicast routing, a packet is routed, hop by hop, from its source to its
destination with the help of forwarding tables.
The source host needs no forwarding table because it delivers its packet to the
default router in its local network.
The destination host needs no forwarding table either because it receives the
packet from its default router in its local network.
This means that only the routers that glue together the networks in the internet
need forwarding tables.
There are several routes that a packet can travel from the source to the
destination; what must be determined is which route the packet should take.
An Internet as a Graph
Least-Cost Routing
When an internet is modeled as a weighted graph, one of the ways to interpret the
best route from the source router to the destination router is to find the least cost
between the two.
The source router chooses a route to the destination router in such a way that the
total cost for the route is the least cost among all possible routes.
In the figure, the best route between A and E is A-B-E, with a cost of 6.
This means that each router needs to find the least-cost route between itself and
all the other routers to be able to route a packet using this criterion.
Least-Cost Trees
If there are N routers in an internet, there are (N − 1) least-cost paths from each router to
any other router.
This means we need N × (N − 1) least-cost paths for the whole internet.
If we have only 10 routers in an internet, we need 90 least-cost paths.
A better way to see all of these paths is to combine them in a least-cost tree.
A least-cost tree is a tree with the source router as the root that spans the whole graph
(visits all other nodes) and in which the path between the root and any other node is the
shortest.
In this way, we can have only one shortest-path tree for each node; we have N least-cost
trees for the whole internet.
The least-cost trees for a weighted graph can have several properties if they are created
using consistent criteria.
1. The least-cost route from X to Y in X’s tree is the inverse of the least-cost route from Y to
X in Y’s tree; the cost in both directions is the same. For example, in Figure, the route
from A to F in A’s tree is (A → B → E → F), but the route from F to A in F’s tree is (F →
E → B → A), which is the inverse of the first route. The cost is 8 in each case.
2. Instead of travelling from X to Z using X’s tree, we can travel from X to Y using X’s tree
and continue from Y to Z using Y’s tree. For example, in Figure , we can go from A to G
in A’s tree using the route (A → B → E → F → G). We can also go from A to E in A’s tree
(A → B → E) and then continue in E’s tree using the route (E → F → G). The
combination of the two routes in the second case is the same route as in the first case. The
cost in the first case is 9; the cost in the second case is also 9 (6 + 3).
Routing Algorithms
1. Distance-Vector Routing
2. Link-State Routing
3. Path-Vector Routing
Distance-Vector Routing
In distance-vector routing, the first thing each node creates is its own least-cost tree with
the rudimentary information it has about its immediate neighbors.
The incomplete trees are exchanged between immediate neighbors to make the trees more
and more complete and to represent the whole internet.
In distance-vector routing, a router continuously tells all of its neighbors what it knows
about the whole internet (although the knowledge can be incomplete).
Bellman-Ford Equation
The heart of distance-vector routing is the famous Bellman-Ford equation.
This equation is used to find the least cost (shortest distance) between a source node, x,
and a destination node, y, through some intermediary nodes (a, b, c, . . .) when the costs
between the source and the intermediary nodes and the least costs between the
intermediary nodes and the destination are given.
The following shows the general case, in which Dij is the shortest distance and cij is the
cost between nodes i and j:
Dxy = min {(cxa + Day), (cxb + Dby), (cxc + Dcy), ...}
In distance-vector routing, normally we want to update an existing least cost with a least
cost through an intermediary node, such as z, if the latter is shorter. In this case, the
equation becomes simpler:
Dxy = min {Dxy, (cxz + Dzy)}
Distance Vectors
The concept of a distance vector is the rationale for the name distance-vector
routing.
A least-cost tree is a combination of least-cost paths from the root of the tree to
all destinations. These paths are graphically glued together to form the tree.
Distance-vector routing unglues these paths and creates a distance vector, a one-
dimensional array to represent the tree.
The name of the distance vector defines the root, the indexes define the
destinations, and the value of each cell defines the least cost from the root to the
destination.
A distance vector does not give the path to the destinations as the least-cost tree
does; it gives only the least costs to the destinations.
Each node in an internet, when it is booted, creates a very rudimentary distance
vector with the minimum information the node can obtain from its neighborhood.
The node sends some greeting messages out of its interfaces and discovers the
identity of the immediate neighbors and the distance between itself and each
neighbor.
It then makes a simple distance vector by inserting the discovered distances in
the corresponding cells and leaves the value of other cells as infinity.
In the first event, node A has sent its vector to node B. Node B updates its vector using the
cost cBA = 2.
In the second event, node E has sent its vector to node B. Node B updates its vector using the
cost cBE = 4.
After the first event, node B has one improvement in its vector: its least cost to node D has
changed from infinity to 5 (via node A).
After the second event, node B has one more improvement in its vector; its least cost to node
F has changed from infinity to 6 (via node E).
Lines 14 to 23 show how the vector can be updated after receiving a vector from the
immediate neighbor.
The for loop in lines 17 to 20 allows all entries (cells) in the vector to be updated
after receiving a new vector.
Note that the node sends its vector in line 12, after being initialized, and in line 22,
after it is updated.
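The update step can also be sketched in Python. The line numbers cited above belong to the original algorithm listing; the sketch below is a separate illustration and does not follow that numbering. The function name update_vector is hypothetical, and the initial vectors are assumptions chosen to agree with the two events described earlier (link costs B-A = 2 and B-E = 4).

```python
import math

INF = math.inf


def update_vector(own, received, link_cost):
    """Apply D[y] = min(D[y], c + received[y]) to every cell; return True if any cell changed."""
    changed = False
    for dest, cost in received.items():
        candidate = link_cost + cost
        if candidate < own.get(dest, INF):
            own[dest] = candidate
            changed = True
    return changed


nodes = "ABCDEFG"
blank = {n: INF for n in nodes}
# Initial vectors: 0 to itself, link cost to each neighbor, infinity elsewhere.
vector_a = {**blank, "A": 0, "B": 2, "D": 3}
vector_b = {**blank, "A": 2, "B": 0, "C": 5, "E": 4}
vector_e = {**blank, "B": 4, "D": 5, "E": 0, "F": 2}

update_vector(vector_b, vector_a, link_cost=2)   # first event: A's vector arrives at B
update_vector(vector_b, vector_e, link_cost=4)   # second event: E's vector arrives at B
print(vector_b)
```

A node would send its own vector to its neighbors whenever update_vector returns True.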
Count to Infinity
A problem with distance-vector routing is that any decrease in cost (good news)
propagates quickly, but any increase in cost (bad news) will propagate slowly.
For a routing protocol to work properly, if a link is broken (cost becomes
infinity), every other router should be aware of it immediately, but in distance-
vector routing, this takes some time.
The problem is referred to as count to infinity.
It sometimes takes several updates before the cost for a broken link is recorded as
infinity by all routers.
Two-Node Loop
One example of count to infinity is the two-node loop problem. To understand
the problem, let us look at the scenario depicted in Figure below
Split Horizon
In this strategy, instead of flooding the table through each interface, each node
sends only part of its table through each interface.
If, according to its table, node B thinks that the optimum route to reach X is via
A, it does not need to advertise this piece of information to A; the information
has come from A (A already knows).
Taking information from node A, modifying it, and sending it back to node A is
what creates the confusion.
In our scenario, node B eliminates the last line of its forwarding table before it
sends it to A. In this case, node A keeps the value of infinity as the distance to X.
Later, when node A sends its forwarding table to B, node B also corrects its
forwarding table.
Poison Reverse
Using the split-horizon strategy has one drawback.
Normally, the corresponding protocol uses a timer, and if there is no news about a
route, the node deletes the route from its table.
When node B in the previous scenario eliminates the route to X from its
advertisement to A, node A cannot guess whether this is due to the split-horizon
strategy (the source of information was A) or because B has not received any news
about X recently.
In the poison reverse strategy B can still advertise the value for X, but if the source
of information is A, it can replace the distance with infinity as a warning: “Do not
use this value; what I know about this route comes from you.”
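The two-node loop and the effect of poison reverse can be reproduced with a small simulation. It is a simplified sketch under assumptions: every link costs one hop, infinity is taken to be 16 (the value RIP uses), nodes A and B alternate advertisements, and the function name simulate is hypothetical. The state is captured just after A has detected that its link to X is broken but before B has heard about it.

```python
INF = 16  # assumed value for infinity, borrowed from RIP


def simulate(poison_reverse, max_rounds=50):
    """Return the number of exchange rounds until both A and B record X as unreachable."""
    a = (INF, None)   # (cost to X, next hop): A has just lost its link to X
    b = (2, "A")      # B still believes X is two hops away via A
    rounds = 0
    while (a[0] < INF or b[0] < INF) and rounds < max_rounds:
        rounds += 1
        # B advertises its distance to X to A.
        advertised = INF if (poison_reverse and b[1] == "A") else b[0]
        cost = min(INF, advertised + 1)
        if a[1] == "B" or cost < a[0]:
            a = (cost, "B") if cost < INF else (INF, None)
        # A advertises its distance to X to B.
        advertised = INF if (poison_reverse and a[1] == "B") else a[0]
        cost = min(INF, advertised + 1)
        if b[1] == "A" or cost < b[0]:
            b = (cost, "A") if cost < INF else (INF, None)
    return rounds


print(simulate(poison_reverse=False), simulate(poison_reverse=True))
```

Without poison reverse the two costs climb step by step until they reach 16; with it, B learns that X is unreachable in the first round.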
Three-Node Instability
The two-node instability can be avoided using split horizon combined with poison
reverse. However, if the instability is between three nodes, stability cannot be
guaranteed.
Link-State Routing
In this algorithm the cost associated with an edge defines the state of the link.
Links with lower costs are preferred to links with higher costs; if the cost of a
link is infinity, it means that the link does not exist or has been broken.
To create a least-cost tree with this method, each node needs to have a complete
map of the network, which means it needs to know the state of each link.
The collection of states for all links is called the link-state database (LSDB).
There is only one LSDB for the whole internet; each node needs to have a
duplicate of it to be able to create the least-cost tree.
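Each node learns two pieces of information about every immediate neighbor: the identity of
the neighbor and the cost of the link to it.
The combination of these two pieces of information is called the LS packet (LSP); the
LSP is sent out of each interface, as shown in the figure.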
Figure: LSPs created and sent out by each node to build LSDB
When a node receives an LSP from one of its interfaces, it compares the LSP with the
copy it may already have.
If the newly arrived LSP is older than the one it has (found by checking the sequence
number), it discards the LSP.
If it is newer or the first one received, the node discards the old LSP (if there is
one) and keeps the received one. It then sends a copy of it out of each interface
except the one from which the packet arrived.
This guarantees that flooding stops somewhere in the network (where a node has
only one interface).
A node can make the whole map if it needs to, using this LSDB.
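The flooding rule can be sketched as follows. This is an illustration only: the topology LINKS is an assumed seven-node graph, the function name flood is hypothetical, and an LSP with the same sequence number as the stored copy is treated as a duplicate and discarded, which is an assumption beyond the "older" case mentioned above.

```python
from collections import deque

# Assumed topology: node -> set of immediate neighbors.
LINKS = {
    "A": {"B", "D"},
    "B": {"A", "C", "E"},
    "C": {"B", "F", "G"},
    "D": {"A", "E"},
    "E": {"B", "D", "F"},
    "F": {"C", "E", "G"},
    "G": {"C", "F"},
}


def flood(links, origin, seq):
    """Flood one LSP from origin; return (stored sequence number per node, copies sent)."""
    stored = {node: -1 for node in links}   # -1 means no copy of this LSP yet
    stored[origin] = seq
    queue = deque((origin, nbr) for nbr in sorted(links[origin]))
    copies_sent = 0
    while queue:
        sender, receiver = queue.popleft()
        copies_sent += 1
        if seq <= stored[receiver]:
            continue                        # older or duplicate LSP: discard
        stored[receiver] = seq              # newer or first copy: keep it
        for nbr in sorted(links[receiver]):
            if nbr != sender:               # every interface except the arrival one
                queue.append((receiver, nbr))
    return stored, copies_sent


stored, copies_sent = flood(LINKS, "A", seq=1)
print(stored, copies_sent)
```

Because every node forwards a given LSP at most once, the queue always empties and the flooding stops.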
Comparison of link-state routing algorithm with the distance-vector routing
In the distance-vector routing algorithm, each router tells its neighbors what it
knows about the whole internet;
in the link-state routing algorithm, each router tells the whole internet what it
knows about its neighbors.
Dijkstra’s Algorithm
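A compact Python sketch of Dijkstra's algorithm shows how a node can build its least-cost tree from the LSDB. The graph below is an assumption: the original figure is not reproduced here, so the link costs were chosen to agree with the route costs quoted earlier (A to E is 6, A to F is 8, A to G is 9). The function names are hypothetical.

```python
import heapq

# Assumed LSDB: node -> {neighbor: link cost}.
GRAPH = {
    "A": {"B": 2, "D": 3},
    "B": {"A": 2, "C": 5, "E": 4},
    "C": {"B": 5, "F": 4, "G": 3},
    "D": {"A": 3, "E": 5},
    "E": {"B": 4, "D": 5, "F": 2},
    "F": {"C": 4, "E": 2, "G": 1},
    "G": {"C": 3, "F": 1},
}


def least_cost_tree(graph, root):
    """Return (cost, parent) dictionaries describing the least-cost tree rooted at root."""
    cost = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    finished = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in finished:
            continue                      # stale heap entry
        finished.add(node)
        for nbr, link in graph[node].items():
            candidate = d + link
            if candidate < cost.get(nbr, float("inf")):
                cost[nbr] = candidate
                parent[nbr] = node
                heapq.heappush(heap, (candidate, nbr))
    return cost, parent


def route(parent, dest):
    """Read the path from the root to dest out of the parent map."""
    path = []
    while dest is not None:
        path.append(dest)
        dest = parent[dest]
    return path[::-1]


cost, parent = least_cost_tree(GRAPH, "A")
print(cost["G"], route(parent, "G"))
```

The parent map is the tree itself; a forwarding table only needs the first hop of each route.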
Path-Vector Routing
Both link-state and distance-vector routing are based on the least-cost goal.
There are instances where this goal is not the priority.
For example, assume that there are some routers in the internet that a sender wants
to prevent its packets from going through.
For example, a router may belong to an organization that does not provide enough
security or it may belong to a commercial rival of the sender which might inspect
the packets for obtaining information.
Least-cost routing does not prevent a packet from passing through an area when
that area is in the least-cost path.
In path-vector (PV) routing, the best route is determined by the source using the
policy it imposes on the route.
One of the common policies uses the minimum number of nodes to be visited
(something similar to least-cost).
Another common policy is to avoid some nodes as the middle node in a route.
Each source has created its own spanning tree that meets its policy.
The policy imposed by all sources is to use the minimum number of nodes to
reach a destination.
The spanning tree selected by A and E is such that the communication does not
pass through D as a middle node.
Similarly, the spanning tree selected by B is such that the communication does
not pass through C as a middle node.
Creation of Spanning Trees
Path-vector routing, like distance-vector routing, is an asynchronous and
distributed routing algorithm.
The spanning trees are made, gradually and asynchronously, by each node.
When a node is booted, it creates a path vector based on the information it can
obtain about its immediate neighbor.
A node sends greeting messages to its immediate neighbors to collect these
pieces of information.
Each node, after the creation of the initial path vector, sends it to all its
immediate neighbors.
Each node, when it receives a path vector from a neighbor, updates its path
vector using an equation similar to the Bellman-Ford, but applying its own policy
instead of looking for the least cost.
Path(x, y) = best {Path(x, y), [x + Path(v, y)]} for all v's in the internet
In this equation, the operator (+) means to add x to the beginning of the path.
The policy is defined by selecting the best of multiple paths. Path-vector routing
also imposes one more condition on this equation: If Path (v, y) includes x, that
path is discarded to avoid a loop in the path.
In other words, x does not want to visit itself when it selects a path to y.
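The update rule and the loop check can be sketched in Python. The names update_paths and fewest_nodes are hypothetical, and the policy shown (prefer the path that visits fewer nodes) is only one of the policies mentioned earlier, used here as an assumption.

```python
def fewest_nodes(current, candidate):
    """Illustrative policy: prefer the path that visits fewer nodes."""
    if current is None or len(candidate) < len(current):
        return candidate
    return current


def update_paths(x, own, received, best=fewest_nodes):
    """Merge a neighbor's path vector into the path vector of node x.

    Both vectors map a destination to a path (a list of nodes) that starts
    at the owner of the vector and ends at the destination.
    """
    for dest, path in received.items():
        if x in path:
            continue                 # Path(v, y) includes x: discard to avoid a loop
        candidate = [x] + path       # the (+) operator: add x to the beginning
        own[dest] = best(own.get(dest), candidate)
    return own


own = {"A": ["A"], "B": ["A", "B"]}
from_b = {"B": ["B"], "A": ["B", "A"], "C": ["B", "C"], "D": ["B", "C", "D"]}
print(update_paths("A", own, from_b))
```

Swapping in a different best function is enough to model another policy, such as refusing any path that passes through a given node.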
Path-Vector Algorithm
Lines 17 to 24 show how the node updates its vector after receiving a vector from
the neighbor.
The update process is repeated forever.
Internet Structure
The Internet has changed from a tree-like structure, with a single backbone, to a
multi-backbone structure run by different private corporations today.
Although it is difficult to give a general view of the Internet today, we can say that
the Internet has a structure similar to what is shown in Figure
Hierarchical Routing
The Internet today is made of a huge number of networks and routers that
connect them.
It is obvious that routing in the Internet cannot be done using a single protocol
for two reasons: a scalability problem and an administrative issue.
Scalability problem means that the size of the forwarding tables becomes huge,
searching for a destination in a forwarding table becomes time-consuming, and
updating creates a huge amount of traffic.
The administrative issue is related to the Internet structure described in Figure 1.
Each ISP is run by an administrative authority. The administrator needs to have
control in its system.
Autonomous Systems
Each ISP is an autonomous system when it comes to managing networks and
routers under its control.
Each AS is given an autonomous system number (ASN) by the ICANN.
Each ASN is a 16-bit unsigned integer that uniquely defines an AS (32-bit ASNs are also
assigned today).
The autonomous systems are categorized according to the way they are
connected to other ASs.
Note that the network in which the source host is connected is not counted in this
calculation because the source host does not use a forwarding table; the packet is
delivered to the default router.
Forwarding Tables
The routers in an autonomous system need to keep forwarding tables to forward
packets to their destination networks.
A forwarding table in RIP is a three-column table in which the first column is the
address of the destination network, the second column is the address of the next
router to which the packet should be forwarded, and the third column is the cost
(the number of hops) to reach the destination network.
Figure shows the three forwarding tables for the routers in Figure-2.
The first and the third columns together convey the same information as does a
distance vector, but the cost shows the number of hops to the destination networks.
Although a forwarding table in RIP defines only the next router in the second
column, it gives the information about the whole least-cost tree based on the
second property of these trees.
For example, R1 defines that the next router for the path to N4 is R2; R2 defines
that the next router to N4 is R3; R3 defines that there is no next router for this path.
The tree is then R1 → R2 → R3 → N4.
RIP Implementation
RIP is implemented as a process that uses the service of UDP on the well-known
port number 520.
RIP runs at the application layer, but creates forwarding tables for IP at the network
layer.
Part of the message, which we call entry, can be repeated as needed in a message.
Each entry carries the information related to one line in the forwarding table of
the router that sends the message.
A request message is sent by a router that has just come up or by a router that has
some time-out entries.
RIP Algorithm
‣ Instead of sending only distance vectors, a router needs to send the whole
contents of its forwarding table in a response message.
‣ The receiver adds one hop to each cost and changes the next router field to the
address of the sending router. We call each route in the modified forwarding
table the received route and each route in the old forwarding table the old
route. The receiving router selects the old routes as the new ones except in the
following three cases:
1. If the received route does not exist in the old forwarding table, it should be
added to the table.
2. If the cost of the received route is lower than the cost of the old one, the received route
should be selected as the new one.
3. If the cost of the received route is higher than the cost of the old one, but the value of the
next router is the same in both routes, the received route should be selected as the new one.
This is the case where the route was actually advertised by the same router in the past, but
now the situation has been changed. For example, suppose a neighbor has previously
advertised a route to a destination with cost 3, but now there is no path between this
neighbor and that destination. The neighbor advertises this destination with cost value
infinity (16 in RIP). The receiving router must not ignore this value even though its old
route has a lower cost to the same destination.
‣ The new forwarding table needs to be sorted according to the destination route (mostly
using the longest prefix first).
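The three cases can be sketched as a single update function. This is a simplified illustration: the function name rip_update and the network and router names are hypothetical, the response is reduced to a destination-to-cost mapping, and the final sort is by destination name rather than by longest prefix.

```python
INFINITY = 16  # RIP's value for an unreachable destination


def rip_update(old_table, response, sender):
    """Return the new forwarding table after a response message from sender.

    Tables map destination -> (next_router, hop_count); response maps
    destination -> hop_count as advertised by sender.
    """
    new_table = dict(old_table)
    for dest, advertised in response.items():
        received = (sender, min(advertised + 1, INFINITY))  # add one hop, cap at 16
        old = old_table.get(dest)
        if old is None:                  # case 1: destination not in the old table
            new_table[dest] = received
        elif received[1] < old[1]:       # case 2: the received route is cheaper
            new_table[dest] = received
        elif old[0] == sender:           # case 3: same next router, accept even if worse
            new_table[dest] = received
    return dict(sorted(new_table.items()))  # simplified sort by destination name


old = {"N1": ("R2", 3), "N2": ("R3", 2), "N3": ("R2", 4)}
response_from_r2 = {"N1": 16, "N2": 5, "N3": 1, "N4": 2}
new = rip_update(old, response_from_r2, "R2")
print(new)
```

N1 illustrates the third case: R2 now reports the destination as unreachable, so the old cost of 3 must be replaced even though it is lower.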
Figure: Example of an autonomous system using RIP
Timers in RIP
RIP uses three timers to support its operation.
The periodic timer controls the advertising of regular update messages.
‣ Each router has one periodic timer that is randomly set to a number between 25
and 35 seconds (to prevent all routers sending their messages at the same time
and creating excess traffic).
‣ The timer counts down; when zero is reached, the update message is sent, and
the timer is randomly set once again.
The expiration timer governs the validity of a route.
‣ When a router receives update information for a route, the expiration timer is
set to 180 seconds for that particular route.
‣ Every time a new update for the route is received, the timer is reset. If there is a
problem on an internet and no update is received within the allotted 180
seconds, the route is considered expired and the hop count of the route is set to
16, which means the destination is unreachable. Every route has its own
expiration timer.
The garbage collection timer is used to purge a route from the forwarding table.
‣ When the information about a route becomes invalid, the router does not
immediately purge that route from its table.
‣ Instead, it continues to advertise the route with a metric value of 16. At the
same time, a garbage collection timer is set to 120 seconds for that route.
‣ When the count reaches zero, the route is purged from the table.
‣ This timer allows neighbors to become aware of the invalidity of a route prior
to purging.
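The three timers can be summarized in a short Python sketch. It is only a model of the rules above, not of a real RIP daemon: the function names are hypothetical, and a route's state is derived from the number of seconds since its last update instead of from running countdown timers.

```python
import random

EXPIRATION = 180          # seconds a route stays valid without an update
GARBAGE_COLLECTION = 120  # further seconds it is advertised with metric 16


def next_periodic_interval(rng=random):
    """Pick the next periodic-timer value, between 25 and 35 seconds."""
    return rng.uniform(25, 35)


def route_status(seconds_since_update, hop_count):
    """Return (status, advertised_metric) for one route."""
    if seconds_since_update < EXPIRATION:
        return "valid", hop_count
    if seconds_since_update < EXPIRATION + GARBAGE_COLLECTION:
        return "expired", 16          # still advertised, but as unreachable
    return "purged", None             # removed from the forwarding table


for age in (10, 200, 300):
    print(age, route_status(age, hop_count=3))
```

The window between 180 and 300 seconds is the period in which neighbors can learn that the route is no longer valid.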
Performance
Before ending this section, let us briefly discuss the performance of RIP:
1. Update Messages. The update messages in RIP have a very simple format and
are sent only to neighbors; they are local. They do not normally create heavy traffic
because the routers try to avoid sending them at the same time.
2. Convergence of Forwarding Tables. RIP uses the distance-vector algorithm,
which can converge slowly if the domain is large, but, since RIP allows only 15
hops in a domain (16 is considered as infinity), there is normally no problem in
convergence.
• The only problems that may slow down convergence are count-to-infinity and
loops created in the domain; use of poison-reverse and split-horizon strategies
added to the RIP extension may alleviate the situation.
3. Robustness.
As we said before, distance-vector routing is based on the concept that each
router sends what it knows about the whole domain to its neighbors.
This means that the calculation of the forwarding table depends on information
received from immediate neighbors, which in turn receive their information from
their own neighbors.
If there is a failure or corruption in one router, the problem will be propagated to
all routers and the forwarding in each router will be affected.
Figure: Metric in OSPF
Forwarding Tables
Each OSPF router can create a forwarding table after finding the shortest-path tree
between itself and the destination using Dijkstra’s algorithm.
Areas
OSPF was designed to be able to handle routing in a small or large autonomous
system.
The formation of shortest-path trees in OSPF requires that all routers flood the
whole AS with their LSPs to create the global LSDB.
OSPF uses another level of hierarchy in routing: the first level is the autonomous
system, the second is the area.
Each router in an area needs to know the information about the link states not only
in its area but also in other areas.
For this reason, one of the areas in the AS is designated as the backbone area,
responsible for gluing the areas together.
The routers in the backbone area are responsible for passing the information
collected by each area to all other areas.
Link-State Advertisement
OSPF is based on the link-state routing algorithm, which requires that a router
advertise the state of each link to all neighbors for the formation of the LSDB.
We can have five types of link-state advertisements:
router link, network link, summary link to network, summary link to AS
border router, and external link
Figure: Types of link-state advertisements (surviving panel titles: 2. Network link, 5. External link)
OSPF Implementation
OSPF is implemented as a program in the network layer, using the service of the
IP for propagation.
An IP datagram that carries a message from OSPF sets the value of the protocol
field to 89.
This means that, although OSPF is a routing protocol to help IP to route its
datagrams inside an AS, the OSPF messages are encapsulated inside datagrams.
OSPF has gone through two versions: version 1 and version 2. Most
implementations use version 2.
OSPF Messages
OSPF is a very complex protocol; it uses five different types of messages.
The hello message (type 1) is used by a router to introduce itself to the neighbors
and announce all neighbors that it already knows.
The database description message (type 2) is normally sent in response to the hello
message to allow a newly joined router to acquire the full LSDB.
The link state request message (type 3) is sent by a router that needs information
about a specific LS.
The link-state update message (type 4) is the main OSPF message used for
building the LSDB. This message, in fact, has five different versions (router link,
network link, summary link to network, summary link to AS border router, and
external link).
The link-state acknowledgment message (type 5) is used to create reliability in
OSPF; each router that receives a link-state update message needs to
acknowledge it.
Authentication
The OSPF common header has the provision for authentication of the message
sender.
This prevents a malicious entity from sending OSPF messages to a router and
causing the router to become part of the routing system to which it actually does
not belong.
OSPF Algorithm
OSPF implements the link-state routing algorithm. However, some changes and
augmentations need to be added to the algorithm:
‣ After each router has created the shortest-path tree, the algorithm needs to use
it to create the corresponding forwarding table.
‣ The algorithm needs to be augmented to handle sending and receiving all five
types of messages.
Performance
Update Messages. The link-state messages in OSPF have a somewhat complex
format. They also are flooded to the whole area. If the area is large, these
messages may create heavy traffic and use a lot of bandwidth.
Convergence of Forwarding Tables. When the flooding of LSPs is completed,
each router can create its own shortest-path tree and forwarding table;
convergence is fairly quick. However, each router needs to run Dijkstra’s
algorithm, which may take some time.
Robustness. The OSPF protocol is more robust than RIP because, after receiving
the completed LSDB, each router is independent and does not depend on other
routers in the area. Corruption or failure in one router does not affect other routers
as seriously as in RIP.
Each autonomous system in this figure uses one of the two common intradomain
protocols, RIP or OSPF.
Each router in each AS knows how to reach a network that is in its own AS, but it
does not know how to reach a network in another AS.
To enable each router to route a packet to any network in the internet, first a
variation of BGP4, called external BGP (eBGP) is installed on each border router
(the one at the edge of each AS which is connected to a router at another AS).
Then the second variation of BGP, called internal BGP (iBGP) is installed on all
routers.
This means that the border routers will be running three routing protocols
(intradomain, eBGP, and iBGP), but other routers are running two protocols
(intradomain and iBGP).
The messages exchanged during three eBGP sessions help some routers know how
to route packets to some networks in the internet, but the reachability information
is not complete.
There are two problems that need to be addressed:
1. Some border routers do not know how to route a packet destined for nonneighbor
ASs. For example, R5 does not know how to route packets destined for networks
in AS3 and AS4. Routers R6 and R9 are in the same situation as R5: R6 does not
know about networks in AS2 and AS4; R9 does not know about networks in AS2
and AS3.
2. None of the nonborder routers know how to route a packet destined for any
networks in other ASs.
To address the above two problems, we need to allow all pairs of routers (border or
nonborder) to run the second variation of the BGP protocol, iBGP.
The updating process does not stop here. For example, after R1 receives the
update message from R2, it combines the reachability information about AS3
with the reachability information it already knows about AS1 and sends a new
update message to R5.
Now R5 knows how to reach networks in AS1 and AS3. The process continues
when R1 receives the update message from R4.
Eventually, a point is reached at which the updates no longer change anything and all
information has been propagated through all ASs.
At this time, each router combines the information received from eBGP and
iBGP and creates what we may call a path table after applying the criteria for
finding the best path, including routing policies.
Figure 3: Finalized BGP path tables
Router R1 now knows that any packet destined for networks N8 or N9 should go
through AS1 and AS2 and the next router to deliver the packet to is router R5.
Similarly, router R4 knows that any packet destined for networks N10, N11, or
N12 should go through AS1 and AS3 and the next router to deliver this packet to is
router R1, and so on.
In the case of a stub AS, the only area border router adds a default entry at the
end of its forwarding table and defines the next router to be the speaker router at
the end of the eBGP connection.
In the figure, R5 in AS2 defines R1 as the default router for all networks other than N8 and
N9. The situation is the same for router R9 in AS4, with R4 as the default router. In AS3,
R6 sets its default router to be R2, but R7 and R8 set their default router to be R6.
In the case of a transit AS, the situation is more complicated. R1 in AS1 needs to inject
the whole contents of the path table for R1 in Figure 3 into its intradomain forwarding
table. The situation is the same for R2, R3, and R4.
Address Aggregation
Intradomain forwarding tables obtained with the help of the BGP4 protocols may
become huge in the case of the global Internet because many destination
networks may be included in a forwarding table.
Fortunately, BGP4 uses prefixes as destination identifiers and allows the
aggregation of these prefixes. For example, prefixes 14.18.20.0/26,
14.18.20.64/26, 14.18.20.128/26, and 14.18.20.192/26 can be combined into
14.18.20.0/24 if all four subnets can be reached through one path.
Even if one or two of the aggregated prefixes need a separate path, the longest
prefix principle allows us to do so.
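The aggregation above, and the longest-prefix exception, can be sketched with Python's standard `ipaddress` module. This is only an illustration: the forwarding table, the `lookup` helper, and the next-hop names "R1" and "R2" are hypothetical and do not come from the figures.

```python
import ipaddress

# The four /26 prefixes from the text.
subnets = [
    ipaddress.ip_network("14.18.20.0/26"),
    ipaddress.ip_network("14.18.20.64/26"),
    ipaddress.ip_network("14.18.20.128/26"),
    ipaddress.ip_network("14.18.20.192/26"),
]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
aggregate = list(ipaddress.collapse_addresses(subnets))

# Hypothetical forwarding table: the aggregate is reached via "R1", but one
# of the /26 subnets needs a separate path via "R2".
table = {
    ipaddress.ip_network("14.18.20.0/24"): "R1",
    ipaddress.ip_network("14.18.20.64/26"): "R2",
}


def lookup(table, address):
    """Return the next hop of the longest prefix that contains address."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]
```

Here `aggregate` contains the single prefix 14.18.20.0/24. `lookup(table, "14.18.20.70")` returns "R2" because the /26 entry is longer than the /24 entry, while `lookup(table, "14.18.20.200")` falls back to "R1".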
Path Attributes
In both intradomain routing protocols (RIP or OSPF), a destination is normally
associated with two pieces of information: next hop and cost.
The first one shows the address of the next router to deliver the packet; the second
defines the cost to the final destination.
Interdomain routing is more involved and naturally needs more information about
how to reach the final destination.
In BGP these pieces are called path attributes. BGP allows a destination to be
associated with up to seven path attributes.
Path attributes are divided into two broad categories: well-known and optional.
The first byte in each attribute defines the four attribute flags.
The next byte defines the type of the attribute as assigned by ICANN (only seven
types have been assigned).
The attribute value length defines the length of the attribute value field (not the
length of the whole attributes section).
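The layout described above can be sketched as a small parser. The names and bit positions of the four flags are not given in these notes; the sketch assumes the layout of RFC 4271 (optional, transitive, partial, extended length, in the four high-order bits), where the extended-length flag makes the length field two bytes instead of one. The function name `parse_attribute` and the returned dictionary are illustrative choices.

```python
import struct

# Flag bits of the first byte (assumed from RFC 4271, high-order bit first).
FLAG_OPTIONAL = 0x80
FLAG_TRANSITIVE = 0x40
FLAG_PARTIAL = 0x20
FLAG_EXTENDED_LENGTH = 0x10

# The seven attribute types mentioned in the text.
TYPE_NAMES = {
    1: "ORIGIN",
    2: "AS-PATH",
    3: "NEXT-HOP",
    4: "MULT-EXIT-DISC",
    5: "LOCAL-PREF",
    6: "ATOMIC-AGGREGATE",
    7: "AGGREGATOR",
}


def parse_attribute(data, offset=0):
    """Parse one path attribute; return (attribute dict, next offset)."""
    flags, type_code = struct.unpack_from("!BB", data, offset)
    offset += 2
    if flags & FLAG_EXTENDED_LENGTH:
        (length,) = struct.unpack_from("!H", data, offset)  # 2-byte length
        offset += 2
    else:
        (length,) = struct.unpack_from("!B", data, offset)  # 1-byte length
        offset += 1
    value = data[offset:offset + length]
    if len(value) != length:
        raise ValueError("attribute value is truncated")
    attribute = {
        "optional": bool(flags & FLAG_OPTIONAL),
        "transitive": bool(flags & FLAG_TRANSITIVE),
        "partial": bool(flags & FLAG_PARTIAL),
        "type": type_code,
        "name": TYPE_NAMES.get(type_code, "UNKNOWN"),
        "value": value,
    }
    return attribute, offset + length
```

For example, `parse_attribute(bytes([0x40, 6, 0]))` describes a well-known transitive ATOMIC-AGGREGATE attribute with an empty value, and the returned offset 3 points at the next attribute in the section.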
ORIGIN (type 1)
‣ This is a well-known mandatory attribute, which defines the source of the routing
information.
‣ This attribute can be defined by one of three values: 0, 1, and 2.
‣ Value 0 (IGP) means that the information about the path has been taken from an
intradomain protocol (RIP or OSPF).
‣ Value 1 (EGP) means that the information was learned through the Exterior
Gateway Protocol (EGP). Value 2 (INCOMPLETE) means that it comes from some
other, unknown source.
AS-PATH (type 2)
‣ This is a well-known mandatory attribute, which defines the list of autonomous
systems through which the destination can be reached.
‣ The AS-PATH attribute helps prevent a loop. Whenever an update message
arrives at a router that lists the current AS in the path, the router drops that path.
‣ The AS-PATH can also be used in route selection.
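The loop check on AS-PATH can be sketched in a few lines. The function names and the list representation of the path are assumptions made for illustration; a real speaker works on the encoded attribute. The sketch also assumes the usual behaviour that a speaker prepends its own AS number before advertising a path to a neighbor in another AS.

```python
def accept_update(local_as, as_path):
    """Return True if the path may be kept, False if it would form a loop."""
    # A path that already lists our own AS has passed through us before.
    return local_as not in as_path


def advertise(local_as, as_path):
    """Prepend the local AS before sending the path to an eBGP neighbor."""
    return [local_as] + list(as_path)
```

With AS numbers 1, 2, and 3, `accept_update(1, [2, 3])` is True, while `accept_update(1, [2, 1, 3])` is False and the path is dropped.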
NEXT-HOP (type 3)
‣ This is a well-known mandatory attribute, which defines the next router to which
the data packet should be forwarded.
‣ This attribute helps to inject path information collected through the operations of
eBGP and iBGP into the intradomain routing protocols such as RIP or OSPF.
ATOMIC-AGGREGATE (type 6)
‣ This is a well-known discretionary attribute, which defines the destination prefix
as not aggregate; it only defines a single destination network.
‣ This attribute has no value field, which means the value of the length field is
zero.
AGGREGATOR (type 7)
‣ This is an optional transitive attribute, which emphasizes that the destination
prefix is an aggregate.
‣ The attribute value gives the number of the last AS that did the aggregation
followed by the IP address of the router that did so.
Route Selection
‣ In the case where multiple routes are received to a destination, BGP needs to
select one among them.
‣ The route selection process in BGP is not as easy as the one in an intradomain
routing protocol, which is based on the shortest-path tree.
‣ A route in BGP has some attributes attached to it and it may come from an eBGP
session or an iBGP session.
Figure : Flow diagram for route selection
The router extracts the routes which meet the criteria in each step.
If only one route is extracted, it is selected and the process stops; otherwise, the
process continues with the next step.
Note that the first choice is related to the LOCAL-PREF attribute, which reflects
the policy imposed by the administration on the route.
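The step-by-step extraction described above can be sketched as a chain of filters. Only the first step (LOCAL-PREF) is confirmed by the text; the remaining steps and their order (shortest AS-PATH, lowest MULT-EXIT-DISC, eBGP over iBGP, lowest router ID) are assumptions based on common BGP practice and may not match the flow diagram exactly. The route dictionaries and field names are hypothetical.

```python
def select_route(routes):
    """Return (selected route, name of the step that decided)."""
    if not routes:
        raise ValueError("no routes to select from")
    # Each step maps a route to a value; the smallest value wins the step.
    steps = [
        ("highest LOCAL-PREF", lambda r: -r["local_pref"]),
        ("shortest AS-PATH", lambda r: len(r["as_path"])),
        ("lowest MULT-EXIT-DISC", lambda r: r["med"]),
        ("eBGP over iBGP", lambda r: 0 if r["session"] == "eBGP" else 1),
        ("lowest router ID", lambda r: r["router_id"]),
    ]
    candidates = list(routes)
    for name, key in steps:
        best = min(key(r) for r in candidates)
        # Extract only the routes that meet the criterion of this step.
        candidates = [r for r in candidates if key(r) == best]
        if len(candidates) == 1:
            return candidates[0], name  # one route left: the process stops
    return candidates[0], "tie after all steps"


route_a = {"local_pref": 100, "as_path": [2, 3], "med": 0,
           "session": "eBGP", "router_id": 1}
route_b = {"local_pref": 200, "as_path": [4, 5, 6], "med": 0,
           "session": "iBGP", "router_id": 2}
route_c = {"local_pref": 100, "as_path": [4, 5, 6], "med": 0,
           "session": "eBGP", "router_id": 3}
```

`select_route([route_a, route_b])` picks `route_b` at the LOCAL-PREF step even though its AS-PATH is longer, which shows how policy takes precedence. `select_route([route_a, route_c])` ties on LOCAL-PREF and is decided by the shorter AS-PATH of `route_a`.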
Messages
BGP uses four types of messages for communication between the BGP speakers
across the ASs and inside an AS: open, update, keepalive, and notification.
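As a rough illustration of how the four message types are told apart, the sketch below builds and parses the common message header. The header layout (16-byte marker of all ones, 2-byte length, 1-byte type) and the type codes are taken from RFC 4271, not from these notes; the function names are illustrative.

```python
import struct

# Type codes as assigned in RFC 4271.
MESSAGE_TYPES = {1: "open", 2: "update", 3: "notification", 4: "keepalive"}
MARKER = b"\xff" * 16  # 16 bytes, all bits set to one
HEADER_LEN = 19        # marker (16) + length (2) + type (1)


def build_keepalive():
    """A keepalive message consists of the header only."""
    return MARKER + struct.pack("!HB", HEADER_LEN, 4)


def parse_header(data):
    """Return (total message length, message type name)."""
    if len(data) < HEADER_LEN:
        raise ValueError("message is shorter than the header")
    if data[:16] != MARKER:
        raise ValueError("bad marker")
    length, type_code = struct.unpack_from("!HB", data, 16)
    if type_code not in MESSAGE_TYPES:
        raise ValueError("unknown message type")
    return length, MESSAGE_TYPES[type_code]
```

`parse_header(build_keepalive())` returns `(19, "keepalive")`.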
Performance
BGP performance can be compared with RIP. BGP speakers exchange a lot of
messages to create forwarding tables, but BGP is free from loops and count-to-
infinity.
The same weakness we mentioned for RIP about propagation of failure and
corruption also exists in BGP.
Multicasting Routing
In multicasting, there is one source and a group of destinations. The relationship
is one to many.
In this type of communication, the source address is a unicast address, but the
destination address is a group address, which identifies a group of one or more
destination networks in which there is at least one member of the group that is
interested in receiving the multicast datagram.
Multicast Open Shortest Path First (MOSPF) is the extension of the Open
Shortest Path First (OSPF) protocol, which is used in unicast routing.
In unicast link-state routing, each router in the internet has a link-state database
(LSDB) that can be used to create a shortest-path tree.
A router goes through the following steps to forward a multicast packet received
from source S and to be sent to destination G (a group of recipients):
1. The router uses the Dijkstra algorithm to create a shortest-path tree with S as the
root and all destinations in the internet as the leaves.
- This shortest-path tree is different from the one the router normally uses for
unicast forwarding, in which the root of the tree is the router itself.
- In this case, the root of the tree is the source of the packet defined in the
source address of the packet.
- The router is capable of creating this tree because it has the LSDB, the
whole topology of the internet; the Dijkstra algorithm can be used to create
a tree with any root, no matter which router is using it.
- The point we need to remember is that the shortest-path tree created this
way depends on the specific source. For each source we need to create a
different tree.
2. The router finds itself in the shortest-path tree created in the first step. In other
words, the router creates a shortest-path subtree with itself as the root of the
subtree.
3. The shortest-path subtree is actually a broadcast subtree with the router as the
root and all networks as the leaves.
- The router now uses a strategy similar to the one in the case of DVMRP to
prune the broadcast tree and to change it to a multicast tree.
- The IGMP protocol is used to find the information at the leaf level.
- MOSPF has added a new type of link state update packet that floods the
membership to all routers.
- The router can use the information it receives in this way and prune the
broadcast tree to make the multicast tree.
4. The router can now forward the received packet out of only those interfaces that
correspond to the branches of the multicast tree. We need to make certain that a
copy of the multicast packet reaches all networks that have active members of the
group and that it does not reach those networks that do not.
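The four steps above can be sketched as follows. This is a simplified model, not the MOSPF implementation: routers are the only nodes, the LSDB is a dictionary of link costs, `members` is the set of routers that have at least one attached network with a group member (the information flooded by the membership link-state updates), and the source is represented by the router it is attached to. The topology and all names are hypothetical.

```python
import heapq


def shortest_path_tree(lsdb, root):
    """Step 1: Dijkstra from root; return {router: parent in the tree}."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in lsdb[u].items():
            nd = d + cost
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return parent


def children_of(parent):
    """Step 2: turn the parent map into {router: [children]}."""
    kids = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            kids[par].append(node)
    return kids


def has_member(node, kids, members):
    """Step 3: True if node or any router below it has a group member."""
    return node in members or any(has_member(k, kids, members) for k in kids[node])


def outgoing_branches(lsdb, source, me, members):
    """Step 4: neighbors to which router `me` forwards the packet."""
    parent = shortest_path_tree(lsdb, source)
    kids = children_of(parent)
    # Keep only the branches of the subtree rooted at `me` that survive pruning.
    return sorted(k for k in kids.get(me, []) if has_member(k, kids, members))


# Hypothetical topology: S - A, A - B - D, A - C - E, all links with cost 1.
lsdb = {
    "S": {"A": 1},
    "A": {"S": 1, "B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "E": 1},
    "D": {"B": 1},
    "E": {"C": 1},
}
members = {"D"}  # only router D has a network with a group member
```

With this topology, `outgoing_branches(lsdb, "S", "A", members)` returns `["B"]`: router A forwards toward B, which leads to the member behind D, and the branch toward C is pruned. Because the tree is rooted at the source, the same function gives a different result for a different source.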