Module 4 - Transport Layer - Lecture Notes
1 Dr. Kala Venugopal, Associate Professor, Dept. of ISE, Acharya Institute of Technology
COMPUTER NETWORKS 21CS52
However, certain services can be offered by a transport protocol even when the
underlying network protocol does not offer the corresponding service at the network layer.
For example, a transport protocol can offer reliable data transfer service to an application
even when the underlying network protocol is unreliable, that is, even when the network
protocol loses, garbles, or duplicates packets.
A state diagram for connection establishment and release for these simple primitives
is given in Fig. 4-3.
Symmetric release does the job when each process has a fixed amount of data to send
and clearly knows when it has sent it. In other situations, determining that all the work has
been done and the connection should be terminated is not so obvious. One can envision a
protocol in which host 1 says ‘‘I am done. Are you done too?’’ If host 2 responds ‘‘I am done
too. Goodbye,’’ the connection can be safely released.
In Figure 4-8(a), we see the normal case in which one of the users sends a DR
(DISCONNECTION REQUEST) segment to initiate the connection release. When it arrives,
the recipient sends back a DR segment and starts a timer, just in case its DR is lost. When
this DR arrives, the original sender sends back an ACK segment and releases the connection.
Finally, when the ACK segment arrives, the receiver also releases the connection.
If the final ACK segment is lost, as shown in Figure 4-8(b), the situation is saved by
the timer. When the timer expires, the connection is released anyway. Now consider the case
of the second DR being lost. The user initiating the disconnection will not receive the
expected response, will time out, and will start all over again.
In Figure 4-8(c), we see how this works, assuming that the second time no segments
are lost and all segments are delivered correctly and on time. Our last scenario, Figure
4-8(d), is the same as Figure 4-8(c) except that now we assume all the repeated attempts to
retransmit the DR also fail due to lost segments. After N retries, the sender just gives up and
releases the connection. Meanwhile, the receiver times out and exits.
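The retry-and-give-up behavior of scenarios (c) and (d) can be sketched as a small simulation (the function name and result strings below are illustrative, not part of any real protocol API):

```python
# Sketch of the initiator's retry loop from Figure 4-8(c) and (d).
# dr_outcomes[i] is True when the i-th DR exchange completes in time.
def release_connection(dr_outcomes):
    for attempt, succeeded in enumerate(dr_outcomes, start=1):
        if succeeded:        # DR answered and ACK delivered: normal release
            return f"released after attempt {attempt}"
    return "gave up after N retries; released anyway"   # Figure 4-8(d)

print(release_connection([False, False, True]))   # scenario (c): a retry works
print(release_connection([False, False, False]))  # scenario (d): all DRs lost
```

Either way the initiator eventually releases the connection; the difference is only whether the release is agreed (c) or unilateral after N failed retries (d).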
Figure 4-8. Four protocol scenarios for releasing a connection. (a) Normal
case of three-way handshake. (b) Final ACK lost. (c) Response lost. (d) Response lost and subsequent DRs lost.
4.2.5. Multiplexing
Multiplexing, or sharing several conversations over connections, virtual circuits, and
physical links, plays a role in several layers of the network architecture. In the transport
layer, the need for multiplexing can arise in several ways.
For example, if only one network address is available on a host, all transport
connections on that machine must use it. When a segment comes in, some way is needed to
tell which process to give it to. This situation, called multiplexing, is shown in Figure 4-9(a).
In this figure, four distinct transport connections all use the same network connection (e.g.,
IP address) to the remote host.
Multiplexing can also be useful in the transport layer for another reason. Suppose,
for example, that a host has multiple network paths that it can use. If a user needs more
bandwidth or more reliability than one of the network paths can provide, a way out is to
have a connection that distributes the traffic among multiple network paths on a round-
robin basis, as indicated in Figure 4-9(b). This method is called inverse multiplexing. With k
network connections open, the effective bandwidth might be increased by a factor of k. An
example of inverse multiplexing is SCTP (Stream Control Transmission Protocol), which can
run a connection using multiple network interfaces.
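The round-robin distribution of inverse multiplexing can be sketched as follows (the function and path names are illustrative):

```python
from itertools import cycle

# Sketch of inverse multiplexing as in Figure 4-9(b): one connection's
# segments are spread across k network paths in round-robin order.
def inverse_mux(segments, paths):
    assignment = {p: [] for p in paths}
    for seg, path in zip(segments, cycle(paths)):
        assignment[path].append(seg)
    return assignment

spread = inverse_mux(list(range(6)), ["path0", "path1", "path2"])
print(spread)   # each path carries every third segment
```

With k paths, each path carries 1/k of the segments, which is why the effective bandwidth can grow by up to a factor of k.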
Based on only this state information, the client must decide whether to retransmit
the most recent segment. If a crash occurs after the acknowledgement has been sent but
before the write has been fully completed, the client will receive the acknowledgement and
thus be in state S0 when the crash recovery announcement arrives. The client will therefore
not retransmit.
At this point you may be thinking: ‘‘That problem can be solved easily. All you must
do is reprogram the transport entity to first do the write and then send the
acknowledgement.’’ Try again. Imagine that the write has been done but the crash occurs
before the acknowledgement can be sent. The client will be in state S1 and thus retransmit,
leading to an undetected duplicate segment in the output stream to the server application
process.
No matter how the client and server are programmed, there are always situations
where the protocol fails to recover properly. The server can be programmed in one of two
ways: acknowledge first or write first. The client can be programmed in one of four ways:
always retransmit the last segment, never retransmit the last segment, retransmit only in
state S0, or retransmit only in state S1. This gives eight combinations, but as we shall see,
for each combination there is some set of events that makes the protocol fail.
Three events are possible at the server: sending an acknowledgement (A), writing to
the output process (W), and crashing (C). The three events can occur in six different
orderings: AC(W), AWC, C(AW), C(WA), WAC, and WC(A), where the parentheses are used
to indicate that neither A nor W can follow C (i.e., once it has crashed, it has crashed). Figure
4-10 shows all eight combinations of client and server strategies and the valid event
sequences for each one.
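This case analysis can be checked mechanically. The sketch below (the data representation is mine, not from the text) pairs the two server strategies with the four client strategies and finds, for each of the eight combinations, an event ordering on which the protocol fails (a write lost or duplicated):

```python
# Server strategies constrain which orderings can occur: with "ack first",
# A precedes W whenever both happen; with "write first", W precedes A.
SERVER_ORDERINGS = {
    "ack first":   ["AC(W)", "AWC", "C(AW)"],
    "write first": ["WC(A)", "WAC", "C(WA)"],
}
# Events completed before the crash, for each ordering
DONE = {"AC(W)": {"A"}, "AWC": {"A", "W"}, "C(AW)": set(),
        "C(WA)": set(), "WAC": {"W", "A"}, "WC(A)": {"W"}}
CLIENT_POLICIES = ["always", "never", "only in S0", "only in S1"]

def writes_after_recovery(done, policy):
    state = "S0" if "A" in done else "S1"   # S0 = ack received before crash
    retransmit = {"always": True, "never": False,
                  "only in S0": state == "S0",
                  "only in S1": state == "S1"}[policy]
    # one write if it happened before the crash, plus one per retransmission
    return ("W" in done) + retransmit       # 1 = OK, 0 = lost, 2 = duplicate

for server, orderings in SERVER_ORDERINGS.items():
    for policy in CLIENT_POLICIES:
        fails = [o for o in orderings
                 if writes_after_recovery(DONE[o], policy) != 1]
        print(f"{server:11s} / retransmit {policy:10s}: fails on {fails}")
```

Running this shows a nonempty failure list for every one of the eight combinations, which is exactly the claim made above.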
the network by the transport layer. The only effective way to control congestion is for the
transport protocols to send packets into the network more slowly.
flows without decreasing the capacity of another, lower flow. For example, if more of the
bandwidth on the link between R2 and R3 is given to flow B, there will be less for flow A.
This is reasonable as flow A already has more bandwidth. However, the capacity of flow C
or D (or both) must be decreased to give more bandwidth to B, and these flows will have
less bandwidth than B. Thus, the allocation is max-min fair.
Convergence
A final criterion is that the congestion control algorithm converges quickly to a fair and
efficient allocation of bandwidth. The discussion of the desirable operating point above
assumes a static network environment. However, connections are always coming and going
in a network, and the bandwidth needed by a given connection will vary over time too, for
example, as a user browses Web pages and occasionally downloads large videos.
Because of the variation in demand, the ideal operating point for the network varies
over time. A good congestion control algorithm should rapidly converge to the ideal
operating point, and it should track that point as it changes over time. If the convergence is
too slow, the algorithm will never be close to the changing operating point. If the algorithm
is not stable, it may fail to converge to the right point in some cases, or even oscillate around
the right point.
The way that a transport protocol should regulate the sending rate depends on the
form of the feedback returned by the network. Different network layers may return different
kinds of feedback. The feedback may be explicit or implicit, and it may be precise or
imprecise.
Figure 4-12. (a) A fast network feeding a low-capacity receiver. (b) A slow network feeding a high-capacity receiver.
An example of an explicit, precise design is when routers tell the sources the rate at
which they may send. Designs in the literature such as XCP (eXplicit Congestion Protocol)
operate in this manner. An explicit, imprecise design is the use of ECN (Explicit Congestion
Notification) with TCP. In this design, routers set bits on packets that experience congestion
to warn the senders to slow down, but they do not tell them how much to slow down.
In other designs, there is no explicit signal. FAST TCP measures the round-trip delay
and uses that metric as a signal to avoid congestion. Finally, in the form of congestion control
most prevalent in the Internet today, TCP with drop-tail or RED routers, packet loss is
inferred and used to signal that the network has become congested.
There are many variants of this form of TCP, including CUBIC TCP, which is used in
Linux. Combinations are also possible. For example, Windows includes Compound TCP that
uses both packet loss and delay as feedback signals. These designs are summarized in Figure
4-13.
Chiu and Jain (1989) studied the case of binary congestion feedback and concluded
that AIMD (Additive Increase Multiplicative Decrease) is the appropriate control law (the
way in which the rates are increased or decreased) to arrive at the efficient and fair
operating point.
To argue this case, they constructed a graphical argument for the simple case of two
connections competing for the bandwidth of a single link. The graph in Figure 4-14 shows
the bandwidth allocated to user 1 on the x-axis and to user 2 on the y-axis.
When the allocation is fair, both users will receive the same amount of bandwidth.
This is shown by the dotted fairness line. When the allocations sum to 100%, the capacity of
the link, the allocation is efficient. This is shown by the dotted efficiency line. A congestion
signal is given by the network to both users when the sum of their allocations crosses this
line. The intersection of these lines is the desired operating point, when both users have the
same bandwidth and all the network bandwidth is used.
Now consider the case that the users additively increase their bandwidth allocations
and then multiplicatively decrease them when congestion is signaled. This behavior is the
AIMD control law, and it is shown in Figure 4-15. The path traced by this behavior does
converge to the optimal point that is both fair and efficient. This convergence happens no
matter what the starting point, making AIMD broadly useful.
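The convergence argument can be seen numerically with a small simulation of two users following AIMD on a unit-capacity link (the increment, decrease factor, and step count below are illustrative parameters, not values from the text):

```python
# Sketch of the AIMD control law from Figure 4-15 for two users
# sharing a link of capacity 1.0.
def aimd(x1, x2, capacity=1.0, add=0.01, mult=0.5, steps=5000):
    for _ in range(steps):
        if x1 + x2 > capacity:      # congestion signaled to both users
            x1 *= mult              # multiplicative decrease
            x2 *= mult
        else:
            x1 += add               # additive increase
            x2 += add
    return x1, x2

x1, x2 = aimd(0.9, 0.1)             # start far from the fairness line
print(round(x1, 3), round(x2, 3))   # allocations end up nearly equal
```

The intuition is visible in the code: additive increase leaves the difference between the two rates unchanged, while each multiplicative decrease halves it, so the trajectory is driven toward the fairness line while the total oscillates around the capacity.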
• The two ports serve to identify the endpoints within the source and destination
machines.
• The UDP length field includes the 8-byte header and the data. The minimum length
is 8 bytes, to cover the header. The maximum length is 65,515 bytes, which is lower
than the largest number that will fit in 16 bits because of the size limit on IP packets.
• An optional Checksum is also provided for extra reliability. It checksums the header,
the data, and a conceptual IP pseudo header.
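The checksum computation can be sketched as follows (a simplified IPv4 version; the function names are illustrative). The pseudo header prepends the source and destination IP addresses, a zero byte, the protocol number (17 for UDP), and the UDP length:

```python
import struct

# Sketch: the 16-bit one's-complement Internet checksum over an IPv4
# pseudo header, the UDP header, and the data.
def checksum16(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"                              # pad to 16-bit boundary
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)     # fold the carries back in
    return ~total & 0xFFFF

def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload):
    length = 8 + len(payload)                        # UDP header is 8 bytes
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    return checksum16(pseudo + header + payload)

cks = udp_checksum(bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]), 1234, 53, b"hi")
```

A receiver verifies by running the same sum over the datagram with the transmitted checksum included; a correct datagram sums to all ones, so the complemented result is zero.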
4.4.2. Remote Procedure Call
When a process on machine 1 calls a procedure on machine 2, the calling process on 1 is
suspended and execution of the called procedure takes place on 2. Information can be
transported from the caller to the callee in the parameters and can come back in the
procedure result. No message passing is visible to the application programmer. This
technique is known as RPC (Remote Procedure Call) and has become the basis for many
networking applications. Traditionally, the calling procedure is known as the client and the
called procedure is known as the server.
The idea behind RPC is to make a remote procedure call look as much as possible like
a local one. In the simplest form, to call a remote procedure, the client program must be
bound with a small library procedure, called the client stub, that represents the server
procedure in the client’s address space. Similarly, the server is bound with a procedure
called the server stub. These procedures hide the fact that the procedure call from the client
to the server is not local.
The actual steps in making an RPC are shown in Fig. 4-17. Step 1 is the client calling
the client stub. This call is a local procedure call, with the parameters pushed onto the stack
in the normal way. Step 2 is the client stub packing the parameters into a message and
making a system call to send the message. Packing the parameters is called marshaling. Step
3 is the operating system sending the message from the client machine to the server
machine. Step 4 is the operating system passing the incoming packet to the server stub.
Finally, step 5 is the server stub calling the server procedure with the unmarshaled
parameters. The reply traces the same path in the other direction.
In terms of transport layer protocols, UDP is a good base on which to implement RPC.
Both requests and replies may be sent as a single UDP packet in the simplest case and the
operation can be fast. DNS requests and replies are also sent through UDP.
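The five steps above can be sketched as a minimal RPC over UDP (all names here are illustrative; real RPC systems use far more elaborate marshaling than JSON):

```python
import json
import socket
import threading

def add(a, b):                                   # the "remote" procedure
    return a + b

def server_stub(sock):
    msg, addr = sock.recvfrom(4096)              # step 4: OS hands packet to stub
    call = json.loads(msg)                       # unmarshal the parameters
    result = {"add": add}[call["proc"]](*call["args"])   # step 5: real call
    sock.sendto(json.dumps({"result": result}).encode(), addr)

def client_stub(server_addr, proc, *args):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    msg = json.dumps({"proc": proc, "args": args}).encode()  # step 2: marshal
    sock.sendto(msg, server_addr)                # step 3: send the request
    reply, _ = sock.recvfrom(4096)               # wait for the reply
    sock.close()
    return json.loads(reply)["result"]

srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv.bind(("127.0.0.1", 0))
threading.Thread(target=server_stub, args=(srv,), daemon=True).start()
print(client_stub(srv.getsockname(), "add", 2, 3))   # prints 5
```

To the caller, `client_stub(addr, "add", 2, 3)` looks like an ordinary local call; the message passing is hidden inside the stubs, which is the whole point of RPC.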
Figure 4-17. Steps in making a remote procedure call. The stubs are shaded.
Because RTP just uses normal UDP, its packets are not treated specially by the
routers unless some normal IP quality-of-service features are enabled. There are no special
guarantees about delivery, and packets may be lost, delayed, corrupted, etc. The RTP format
contains several features to help receivers work with multimedia information.
1. Each packet sent in an RTP stream is given a number one higher than its predecessor.
This numbering allows the destination to determine if any packets are missing. If a
packet is missing, the best action for the destination to take is up to the application. It
may be to skip a video frame if the packets are carrying video data, or to approximate
the missing value by interpolation if the packets are carrying audio data.
2. Each RTP payload may contain multiple samples, and they may be coded any way that
the application wants. For example, a single audio stream may be encoded as 8-bit PCM
samples at 8 kHz using delta encoding, predictive encoding, GSM encoding, MP3
encoding, and so on. RTP provides a header field in which the source can specify the
encoding.
3. Another facility many real-time applications need is timestamping. The idea here is to
allow the source to associate a timestamp with the first sample in each packet. This
mechanism allows the destination to do a small amount of buffering and play each
sample the right number of milliseconds after the start of the stream, independently of
when the packet containing the sample arrived.
The RTP header is illustrated in Fig. 4-18. It consists of three 32-bit words and potentially
some extensions.
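Packing those three words can be sketched as follows (a minimal header only: version 2, no padding, no extension, no contributing sources; the function name is illustrative):

```python
import struct

# Sketch: the three 32-bit words of a basic RTP header.
# Word 1: version/flags, marker + payload type, sequence number.
# Word 2: timestamp.  Word 3: synchronization source (SSRC).
def rtp_header(payload_type, seq, timestamp, ssrc, marker=0):
    byte0 = 2 << 6                               # V=2, P=0, X=0, CC=0
    byte1 = (marker << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

hdr = rtp_header(payload_type=0, seq=42, timestamp=160, ssrc=0x1234)
print(len(hdr))   # 12 bytes: three 32-bit words
```

Note how the header carries exactly the facilities listed above: the sequence number for loss detection, the payload type for the encoding, and the timestamp for playout buffering.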
• The ACK bit is set to 1 to indicate that the Acknowledgement number is valid. If ACK
is 0, the segment does not contain an acknowledgement, so the Acknowledgement
number field is ignored.
• The PSH bit indicates PUSHed data. The receiver is hereby kindly requested to deliver
the data to the application upon arrival and not buffer it until a full buffer has been
received. The RST bit is used to abruptly reset a connection that has become confused
due to a host crash or some other reason.
• The SYN bit is used to establish connections. The connection request has SYN = 1 and
ACK = 0 to indicate that the piggyback acknowledgement field is not in use. The
connection reply does bear an acknowledgement, however, so it has SYN = 1 and
ACK = 1.
• The FIN bit is used to release a connection. It specifies that the sender has no more
data to transmit.
• Flow control in TCP is handled using a variable-sized sliding window. The Window
size field tells how many bytes may be sent starting at the byte acknowledged.
• A Checksum is also provided for extra reliability. It checksums the header, the data,
and a conceptual pseudo header in the same way as UDP.
• The Options field provides a way to add extra facilities not covered by the regular
header.
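The flag bits described above can be built and decoded from a raw header like this (a sketch assuming the standard 20-byte TCP header with no options; function names are illustrative):

```python
import struct

# Sketch: encoding and decoding the TCP flag bits. Byte 13 of the
# header holds them: ACK=0x10, PSH=0x08, RST=0x04, SYN=0x02, FIN=0x01.
FLAG_BITS = {0x10: "ACK", 0x08: "PSH", 0x04: "RST", 0x02: "SYN", 0x01: "FIN"}

def make_header(src_port, dst_port, seq, ack, flags, window):
    # data offset = 5 words (20 bytes), stored in the high nibble of byte 12
    return struct.pack("!HHIIBBHHH", src_port, dst_port, seq, ack,
                       5 << 4, flags, window, 0, 0)   # checksum left as 0 here

def decode_flags(header):
    bits = header[13]
    return [name for mask, name in FLAG_BITS.items() if bits & mask]

syn = make_header(3000, 80, 100, 0, 0x02, 65535)   # SYN=1, ACK=0: a request
print(decode_flags(syn))                           # ['SYN']
```

A connection request thus carries only SYN, while the reply carries SYN and ACK together, exactly as the bullets above describe.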
data. The CONNECT primitive sends a TCP segment with the SYN bit on and ACK bit off and
waits for a response.
When this segment arrives at the destination, the TCP entity there checks to see if
there is a process that has done a LISTEN on the port given in the Destination port field. If
not, it sends a reply with the RST bit on to reject the connection. The sequence of TCP
segments sent in the normal case is shown in Fig. 4-21(a). If two hosts simultaneously
attempt to establish a connection between the same two sockets, the sequence of events is
as illustrated in Fig. 4-21(b). The result of these events is that just one connection is
established, not two, because connections are identified by their end points.
Figure 4-21. (a) TCP connection establishment in the normal case. (b) Simultaneous connection
establishment on both sides.
Each connection starts in the CLOSED state. It leaves that state when it does either a
passive open (LISTEN) or an active open (CONNECT). If the other side does the opposite
one, a connection is established and the state becomes ESTABLISHED. Connection release
can be initiated by either side. When it is complete, the state returns to CLOSED.
Figure 4-22. The states used in the TCP connection management finite state machine.
The finite state machine itself is shown in Figure 4-23. The common case of a client actively
connecting to a passive server is shown with heavy lines—solid for the client, dotted for the
server. The lightface lines are unusual event sequences. Each line in Figure 4-23 is marked
by an event/action pair. The event can either be a user-initiated system call (CONNECT,
LISTEN, SEND, or CLOSE), a segment arrival (SYN, FIN, ACK, or RST), or, in one case, a
timeout of twice the maximum packet lifetime. The action is the sending of a control segment
(SYN, FIN, or RST) or nothing, indicated by —. Comments are shown in parentheses.
Figure 4-23. TCP connection management finite state machine. The heavy solid line is the normal path for a client. The
heavy dashed line is the normal path for a server. The light lines are unusual events. Each transition is labeled with the
event causing it and the action resulting from it, separated by a slash.
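A fragment of this machine can be written directly as an event/action table (a sketch covering only the normal client path through Figure 4-23; state and event names follow the figure):

```python
# Sketch: the normal client path of the TCP connection-management FSM
# as a (state, event) -> (new state, action) table.
TRANSITIONS = {
    ("CLOSED", "CONNECT"):    ("SYN SENT", "send SYN"),
    ("SYN SENT", "SYN+ACK"):  ("ESTABLISHED", "send ACK"),
    ("ESTABLISHED", "CLOSE"): ("FIN WAIT 1", "send FIN"),
    ("FIN WAIT 1", "ACK"):    ("FIN WAIT 2", "-"),
    ("FIN WAIT 2", "FIN"):    ("TIME WAIT", "send ACK"),
    ("TIME WAIT", "timeout"): ("CLOSED", "-"),     # twice max packet lifetime
}

def run(events, state="CLOSED"):
    for ev in events:
        state, action = TRANSITIONS[(state, ev)]
    return state

# The full client lifetime: connect, converse, close, and wait out TIME WAIT.
print(run(["CONNECT", "SYN+ACK", "CLOSE", "ACK", "FIN", "timeout"]))
```

Walking the table reproduces the heavy solid line in the figure: the client ends back in CLOSED after the TIME WAIT timeout of twice the maximum packet lifetime.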