Transport Layer Protocols: Transmission Control Protocol (TCP)
Version 2.0
The transport layer performs two main tasks for the application layer by using the
network layer. It provides end-to-end communication between two applications, and
it implements some control functions. The original TCP/IP protocol suite (to be
introduced later) defines two different protocols for this purpose: the Transmission
Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP establishes a virtual
link between two application programs and provides error checking and congestion
control. UDP, on the other hand, keeps only the minimum requirements of an
end-to-end connection and leaves the control mechanisms to the application. TCP is
a reliable, connection-oriented protocol appropriate for applications such as
email delivery and the File Transfer Protocol (FTP). UDP is an unreliable,
connectionless protocol, but it is simple and incurs less communication overhead.
UDP is a good choice for applications such as Voice-over-IP (VoIP), where
packet loss is acceptable to some extent. In the following sections, TCP and UDP
are introduced.
Transmission Control Protocol (TCP)
TCP is the main protocol of the transport layer in packet-switched networks. Together,
the TCP and IP protocols provide the basic structure of the Internet. These two protocols
are complementary in the sense that IP facilitates packet forwarding irrespective of
reliable packet delivery, whereas TCP is responsible for reliable packet delivery
between applications. In the network layer, the order of packets may change during
transmission, because different packets may reach the destination over different
routes. Moreover, some packets may be dropped due to congestion and/or long delays
at a node on the path. The TCP protocol compensates for some of these deficiencies
of the IP protocol and controls congestion in the network.
To implement reliable services, TCP deploys a numbering system as well as some
control mechanisms. The numbering system determines byte, sequence, and
acknowledgment numbers. The control mechanisms include flow, error, and
congestion control. The bytes of the data part of TCP segments are numbered by
the protocol. In fact, the first byte of a data segment gets a sequence number
explicitly, whereas the following bytes get theirs implicitly. The number is put in the
sequence number field of the segment header and is used for flow and error control.
In addition, the parties involved in a connection use {last received byte number + 1}
as the acknowledgment number in their segment header. It indicates the byte number
they expect to receive next. For example, if the acknowledgment number in a segment
header of one of the communicating parties shows 2304, that means the party has
received all of the bytes from the beginning up to and including byte number 2303,
and it is now expecting to receive byte number 2304. TCP connections are full
duplex; each party has its own sequence and acknowledgment numbers for the
stream it transmits.
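As a small illustration of this numbering, the following Python sketch (purely
illustrative helper functions, not part of any TCP implementation) reproduces the
acknowledgment arithmetic described above.

    # Illustrative sketch: how sequence and acknowledgment numbers relate
    # to the byte stream described in the text.

    def next_sequence_number(seq_of_segment: int, data_length: int) -> int:
        """Sequence number of the first byte in the next segment."""
        return seq_of_segment + data_length

    def acknowledgment_number(last_in_order_byte: int) -> int:
        """Byte number the receiver expects next = last received byte + 1."""
        return last_in_order_byte + 1

    # Example from the text: all bytes up to and including 2303 received,
    # so the acknowledgment number sent back is 2304.
    assert acknowledgment_number(2303) == 2304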
Flow control avoids data overflow at the receiver. In other words, the receiver tells
the sender how much to send so that buffer overflow is avoided. Congestion control
is implemented by the sender based on the level of congestion in the network. The error
control mechanism provides a reliable service for TCP. The control mechanisms of
TCP will be explained later on. Prior to that, the TCP/IP protocol suite is introduced to
show the relationship between TCP and the upper- and lower-layer protocols. Also, the
structure of a TCP segment and the concept of connection-oriented TCP are
described to help in understanding the TCP control mechanisms.
TCP/IP Protocol Suite
The TCP and IP protocols were introduced prior to the OSI (Open Systems
Interconnection) model of the International Organization for Standardization. Therefore, the
protocol stack based on TCP/IP, as shown in Fig. 1, does not exactly match the
7-layer OSI model. The TCP/IP protocol suite contains four layers of protocols
(or five layers, depending on which reference you look at!) as defined below.
Layer 4 - Application Layer: This layer performs the functionalities of the application,
presentation, and session layers of the OSI model. Application protocols such as
SMTP (Simple Mail Transfer Protocol), FTP (File Transfer Protocol), and HTTP
(Hypertext Transfer Protocol) operate at this layer.
Layer 3 - Transport Layer: This layer performs the equivalent functions of the OSI
transport layer as well as some OSI session layer functionalities. It provides
different levels of delivery reliability to the application layer, depending on
the application requirements.
Layer 2 ‐ Internet Layer: This layer handles routing tables and packet forwarding.
Layer 1 ‐ Network Access Layer (Data Link + MAC + Physical): Connection to the
physical environment and flow control are performed at this layer. The layer
specifies the characteristics of the hardware and access methods.
Fig. 1 – OSI model versus TCP/IP model (OSI: Application, Presentation, Session,
Transport, Network, Data Link, Physical; TCP/IP: Application, Transport, Internet,
Network Interface)
TCP Segment
In order to understand the TCP protocol, it is necessary to understand the various
fields of a TCP segment header. These fields are described in this section. In the next
sections, we describe how each field is utilized by each TCP service. The TCP
segment header is 20 bytes long, but it may grow to as much as 60 bytes if up to
40 bytes of options are added to the header. The format of a segment is shown in
Fig. 2. The meaning and purpose of each field is as follows:
Source Port Address: A 16-bit field holding the port number of the sending
application.
Destination Port Address: A 16-bit field holding the port number of the receiving
application.
Sequence Number: A 32-bit number giving the sequence number of the first byte of
data in the segment. In other words, it tells the receiver where the segment's first
data byte falls in the byte stream.
Acknowledgment Number: A 32-bit byte number that the receiver expects to receive
next from the sender. An acknowledgment number of 1023 means that all the bytes
starting from the initial sequence number up to and including byte number 1022
have been successfully received.
Header Length: A 4-bit field that represents the header length in multiples of four
bytes.
Reserved: A 6-bit field reserved for the requirements of future protocol
developments.
Control: A 6-bit field containing flags used for connection establishment, termination,
and abortion, for flow control, and for indicating the data transfer mode. Table 1
describes the usage of each flag.
Fig. 2: TCP Segment Format (Source Port Address, Destination Port Address,
Sequence Number, Acknowledgment Number, Header Length, Reserved, Control flags,
Window Size, Checksum, Urgent Pointer, Options and Padding, Data)
Table 1 - TCP Segment Header Flags

Flag   Value   Description
URG      1     Urgent pointer field has a valid value.
ACK      1     Acknowledgment field has a valid value.
PSH      1     The receiving TCP module passes (pushes) the data to the application immediately.
PSH      0     The receiving TCP module may delay the data.
RST      1     The connection request is denied, or the connection is aborted.
SYN      1     Synchronize sequence numbers during connection establishment.
FIN      1     The sender has no more data to send, but is ready to receive.
Window Size: A 16‐bit number specifying the number of bytes the receiver is willing
to receive. The value of this field is used in flow control and congestion control.
Checksum: A 16-bit field used for error detection.
Urgent pointer: A 16‐bit number that is added to the sequence number and shows
the location of the first byte of urgent data in the data section of the segment. Urgent
data is delivered to the recipient application program to be processed before other
data. The value of this field is used when the URG flag is on.
Options: The length of this field may reach up to 40 bytes.
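To make the layout of Fig. 2 concrete, the following Python sketch packs the fixed
20-byte header with the struct module. The helper name and the numeric values are
illustrative assumptions, not a real segment.

    import struct

    # Sketch of packing the fixed 20-byte TCP header of Fig. 2.
    def pack_tcp_header(src_port, dst_port, seq, ack,
                        flags, window, checksum=0, urgent=0, header_len_words=5):
        offset_reserved = header_len_words << 4      # 4-bit header length, reserved bits = 0
        return struct.pack('!HHIIBBHHH',
                           src_port, dst_port,       # 16-bit source/destination ports
                           seq, ack,                 # 32-bit sequence and acknowledgment numbers
                           offset_reserved,          # header length (in 4-byte words) + reserved
                           flags,                    # URG|ACK|PSH|RST|SYN|FIN bits
                           window, checksum, urgent) # 16-bit window, checksum, urgent pointer

    SYN = 0x02
    header = pack_tcp_header(49152, 80, seq=8000, ack=0, flags=SYN, window=10000)
    print(len(header))   # 20 bytes, i.e. a header with no options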
Connection-Oriented TCP
TCP is a connection-oriented protocol. It establishes a virtual connection between
the running application programs (processes) at the two communication end points.
TCP uses port numbers for this purpose. Each well-known port number is associated
with an application, as shown in Table 2. The stream of data is sent from the process
to the transport layer. TCP establishes a connection between the transport layers of
the transmitting and receiving stations. It then divides the stream of data into units
called segments. Segments are numbered and transmitted one by one. The TCP protocol
at the receiver side checks the arriving segments for errors, loss, and duplication.
It reorders the segments, reassembles the stream, and transfers the stream to the
receiving process. TCP thus provides a stream delivery service: both the sender and
the receiver processes deal with a stream of bytes and are not aware of the
segmentation performed in the lower layers. After all segments are transmitted, TCP
at the transmitter side closes the connection.
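The stream abstraction can be seen from an application's point of view with ordinary
sockets. The sketch below (Python; the address, port, and function names are
placeholders) shows that both ends only read and write bytes, while segmentation,
numbering, and acknowledgment happen inside the TCP implementation.

    import socket

    HOST, PORT = '127.0.0.1', 5000   # illustrative endpoint

    def run_server():
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.bind((HOST, PORT))
            srv.listen(1)
            conn, _ = srv.accept()               # three-way handshake completes here
            with conn:
                data = b''
                while chunk := conn.recv(4096):  # read the byte stream
                    data += chunk
                print('received', len(data), 'bytes')

    def run_client():
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
            cli.connect((HOST, PORT))            # connection establishment
            cli.sendall(b'x' * 2000)             # one call; TCP may split it into segments
        # closing the socket sends FIN and terminates the connection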
Table 2 - Port Numbers Used by TCP

Port   Protocol       Description
20     FTP, Data      File Transfer Protocol (data connection)
21     FTP, Control   File Transfer Protocol (control connection)
23     TELNET         Terminal Network
25     SMTP           Simple Mail Transfer Protocol
53     DNS            Domain Name Server
79     Finger         Finger
80     HTTP           Hypertext Transfer Protocol
111    RPC            Remote Procedure Call
A connection-oriented TCP exchange consists of connection establishment, data transfer,
and connection termination phases. The parties involved in connection establishment
are termed the client, which requests the connection, and the server, the destination.
Connection establishment is a three-way handshaking protocol. Fig. 3 shows a
connection establishment. First, the client sends a segment with SYN = 1 to
synchronize the sequence numbers. The segment contains the client's initial sequence
number. Second, the server sends a segment with both the SYN and ACK flags on. This
means the segment contains the initial sequence number of the server and acknowledges
that the first segment transmitted by the client has been successfully received. The
server's segment also contains the value of the window size kept by the server. In the
next sections we will describe how the window size is used. Third, the client sends a
segment with the ACK flag on, which acknowledges the reception of the server's
segment. The window size of the client is included too. Note that the sequence number
of the client's acknowledgment segment is the same as that of the client's first
(synchronizing) segment, because no data is carried in the first segment and the
sequence number does not change.
Fig. 3 – Three-way connection establishment (client: SYN = 1, Seq# = 8000; server:
SYN = 1, ACK = 1, Seq# = 15000, Ack = 8001, RWND = 5000; client: ACK = 1, Seq# = 8000,
Ack = 15001, RWND = 10000)

The data transfer phase is bidirectional. It includes data and acknowledgment
transmission. An acknowledgment is either piggybacked on data or transmitted on its
own. Fig. 4 shows an example of data transfer. The client and the server each have
2000 bytes of data to transmit. The client transmits its data in two segments, and
the server transmits its data all at once. Acknowledgments are piggybacked on the
first three segments. However, there is no more data to transmit in the last segment,
so it carries only an acknowledgment. In the client's data segments, the PSH flag is
on. This means the server's TCP should deliver the data to the server application as
soon as it is received (appropriate for interactive applications).
Either the client or the server can terminate a connection, no matter which one
established it. Also, one party can terminate its half of the connection while it
still receives data from the other party. The termination phase is implemented by
the terminating party transmitting a segment with the FIN flag on, together with the
last available chunk of data. The recipient acknowledges the connection termination
by setting the ACK flag in its next segment.
Fig. 4 – Data transfer (client: Seq# = 8001, Ack = 15001, data bytes 8001-9000,
ACK = 1, PSH = 1; client: Seq# = 9001, Ack = 15001, data bytes 9001-10000, ACK = 1,
PSH = 1; server: Seq# = 15001, Ack = 10001, ACK = 1, data bytes 15001-17000; client:
Seq# = 10001, Ack = 17001, ACK = 1, RWND = 10000)
Flow Control
Flow control determines how much data can be transmitted before an acknowledgment is
received. There are two extremes. One asks for an acknowledgment after every single
byte of data, which causes delay and transmission overhead; the other asks for only
one acknowledgment after all of the data has been transmitted, which gives the
transmitter feedback too late to compensate for transmission problems. For instance,
if the path is congested, the transmitter will not become aware of it in time to slow
down its transmission rate. TCP flow control takes a dynamic approach between these
two extremes.
TCP flow control is window-based. The window size specifies the amount of data that
can be transmitted before an acknowledgment is received. The window slides over the
byte stream as shown in Fig. 5. The size of the window varies according to the
congestion status of the network, captured by the congestion window (CWND), and the
receiver buffer size, captured by the receiver window (RWND). The congestion window
size is determined by the sender according to the congestion level in the network, to
avoid congestion. The receiver window indicates the number of bytes that the receiver
can accept before its buffer overflows. The value of the receiver window is sent to
the transmitter in the acknowledgment message. The size of the sliding window, shown
in Fig. 5, is equal to min(CWND, RWND). The sliding window resizes by opening from
the right and closing from the left: opening lets more bytes enter the window, and
closing lets acknowledged bytes leave the window.
Fig. 5 – TCP sliding window over the byte stream
Example: In the example shown in Fig. 6, CWND is 20 bytes and RWND is 9 bytes.
Accordingly, the TCP window size equals 9. The bytes up to 2002 have been sent.
Bytes 2000, 2001, and 2002 have not yet been acknowledged by the receiver,
but the sender can still transmit bytes 2003 up to 2008.
Fig. 6 - Sliding window (the window covers bytes 2000 to 2008; the next byte to be
sent is 2003)
If the sender receives an acknowledgment value for byte 2003 along with RWND =
9, the window will be updated. It is closed from the left and opened from the right.
The new TCP window will contain bytes 2003 up to 2011.
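The bookkeeping in this example can be captured in a few lines. The following Python
sketch (illustrative only; the function names are not from any real TCP stack)
computes the window size and the number of bytes the sender may still transmit.

    # Sketch of the sliding-window bookkeeping in the example above.
    # Window size = min(CWND, RWND); usable bytes = window - bytes in flight.

    def window_size(cwnd: int, rwnd: int) -> int:
        return min(cwnd, rwnd)

    def usable_window(last_byte_sent: int, last_byte_acked: int, cwnd: int, rwnd: int) -> int:
        in_flight = last_byte_sent - last_byte_acked     # sent but not yet acknowledged
        return window_size(cwnd, rwnd) - in_flight

    # Figure 6: CWND = 20, RWND = 9, bytes up to 2002 sent, bytes up to 1999 acknowledged.
    print(window_size(20, 9))                # 9  -> window covers bytes 2000..2008
    print(usable_window(2002, 1999, 20, 9))  # 6  -> bytes 2003..2008 may still be sent

    # After an ACK value of 2003 (bytes up to 2002 acknowledged) with RWND = 9,
    # the window slides to cover bytes 2003..2011.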
Question: What happens if the sending or receiving application is very slow in a
sliding window mechanism?
Answer: The Silly Window Syndrome occurs! The slow production or consumption of data
causes the window size to shrink down to one byte. Comparing the overhead of a
segment, 40 bytes, with a one-byte body shows that flow control with the sliding
window mechanism becomes very inefficient.
Remedy: If the slowness of the sender's application causes the Silly Window Syndrome,
Nagle's algorithm is the remedy. The prescription is as follows:
• The sender sends the first segment even if it is a small one.
• Next, the sender waits until an ACK is received, or a maximum-size segment's worth
of data has accumulated.
• Step 2 is repeated for the rest of the transmission.
If the slowness of the receiver's application causes the Silly Window Syndrome, two
prescriptions are advised:
Clark's solution: Send an ACK as soon as the data arrives, but keep the window closed
until another maximum-size segment can be received or the buffer is half empty.
Delayed ACK: Delay sending the ACK, by at most 500 ms; this causes the sender to stop
transmitting, but it does not shrink its window size.
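Note that Nagle's algorithm lives inside the TCP stack itself; an application cannot
implement it, only switch it off. The sketch below (Python sockets; the endpoint is a
placeholder) shows the commonly used TCP_NODELAY option, which disables the
coalescing of small segments for latency-sensitive traffic.

    import socket

    # Disable Nagle's algorithm so that small writes are sent immediately
    # instead of being held back until an ACK or a full-size segment.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    s.connect(('192.0.2.10', 80))   # placeholder address and port
    s.sendall(b'a')                 # sent right away despite being a single byte
    s.close()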
Error Control
The reliability of the TCP protocol is achieved by error control mechanisms that
detect and correct errors. Error detection includes identifying corrupted, lost,
out-of-order, and duplicated segments. TCP deploys three tools for error detection
and correction: checksum, acknowledgment, and time-out.
Checksum
Checksum is a simple method for detecting corrupted segments. Each segment carries a
16-bit checksum field in its header. The destination TCP discards segments with a
checksum error.
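The checksum used by TCP is the standard 16-bit Internet checksum (a one's-complement
sum). The sketch below shows the core arithmetic over an arbitrary byte string; a
real TCP checksum additionally covers a pseudo-header (IP addresses, protocol, and
TCP length), which is omitted here for brevity, and the sample bytes are illustrative.

    # 16-bit Internet (one's-complement) checksum over a byte string.
    def internet_checksum(data: bytes) -> int:
        if len(data) % 2:                 # pad odd-length data with a zero byte
            data += b'\x00'
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]     # 16-bit big-endian words
            total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
        return ~total & 0xFFFF            # one's complement of the sum

    segment = bytes.fromhex('0014005000001f400000000050022000') + b'hello'
    print(hex(internet_checksum(segment)))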
Acknowledgement
Every data or control segment that consumes a sequence number is acknowledged to
confirm its reception. There are two kinds of acknowledgment: positive acknowledgment
(ACK for short) and selective acknowledgment (SACK). With positive acknowledgment,
the receiver simply advertises the byte number it expects next and does not report
erroneous segments. SACK is a complementary mechanism to ACK that reports problematic
segments, such as duplicated and out-of-order segments, to the sender.
Timeout Retransmission
This is a very important task of the error control mechanism. Corrupted, lost, and
delayed segments are retransmitted. The criteria for retransmission are
retransmission timer expiration or the reception of three duplicate ACKs.
The TCP sender starts a retransmission time-out (RTO) timer when it sends a
segment. The segment is retransmitted upon RTO timer expiration, no matter what
the reason is for not receiving the ACK. For example, if the segment is received
successfully but the ACK is lost or delayed, the TCP sender still assumes the segment
has been lost. For this reason, the RTO timer is set to a longer time in networks with
a long round-trip time (RTT). RTT is the time for a segment to reach the receiver plus
the time for the corresponding ACK to reach the segment's transmitter.
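In practice, TCP implementations derive the RTO from measured RTT samples. The sketch
below follows the widely used smoothed-RTT approach (in the spirit of RFC 6298); the
class name and the sample values are illustrative assumptions.

    # Sketch of the usual RTO calculation: smoothed RTT plus a variance term.
    # Values are in seconds and purely illustrative.
    class RtoEstimator:
        def __init__(self, alpha=0.125, beta=0.25, min_rto=1.0):
            self.alpha, self.beta, self.min_rto = alpha, beta, min_rto
            self.srtt = None
            self.rttvar = None

        def update(self, rtt_sample: float) -> float:
            if self.srtt is None:                      # first measurement
                self.srtt = rtt_sample
                self.rttvar = rtt_sample / 2
            else:
                self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - rtt_sample)
                self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt_sample
            return max(self.min_rto, self.srtt + 4 * self.rttvar)

    est = RtoEstimator()
    for sample in (0.120, 0.135, 0.500):   # a sudden RTT increase inflates the RTO
        print(round(est.update(sample), 3))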
In networks that need fast retransmission, three duplicate ACKs serve as the trigger
for retransmission. If the sender receives three duplicate ACKs carrying the same
acknowledgment number, it retransmits the segment starting at that byte immediately,
irrespective of the RTO timer value. The advantage of the three-duplicate-ACK rule is
that the receiver does not need to buffer many segments while waiting for the missing
segment to finally arrive.
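A minimal sketch of the duplicate-ACK rule is shown below (illustrative Python, not
taken from a real stack): the sender counts ACKs that repeat the same acknowledgment
number and retransmits once the count reaches three.

    # Three-duplicate-ACK (fast retransmit) rule: if three duplicates of the same
    # acknowledgment number arrive, retransmit the segment that starts at that byte
    # without waiting for the RTO timer.
    def find_fast_retransmits(ack_numbers):
        retransmit = []
        last_ack, dup_count = None, 0
        for ack in ack_numbers:
            if ack == last_ack:
                dup_count += 1
                if dup_count == 3:
                    retransmit.append(ack)      # resend the segment starting at this byte
            else:
                last_ack, dup_count = ack, 0
        return retransmit

    # The segment starting at byte 4001 is lost: later segments keep generating ACK 4001.
    print(find_fast_retransmits([2001, 3001, 4001, 4001, 4001, 4001]))   # -> [4001]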
Congestion Control
In a network, data are queued in the buffers of the interfaces and delivered to the
next stage at an appropriate time. If there is a mismatch between the processing time
or capacity of two interconnected stages of the network, congestion occurs.
Congestion affects two performance metrics in a network: throughput and delay.
Delay can be considered the sum of the propagation, processing, and queuing delays.
Indeed, congestion affects the queuing delay much more than the others. When the
network load increases, congestion deteriorates the delay performance. This situation
is depicted in Fig. 7. In turn, the increased delay worsens the congestion, because
the sender retransmits a packet when its ACK is late, which increases the network
load further. Throughput and goodput are two performance measures that show the
behaviour of the network versus load. Throughput is defined as the number of packets
passing through the network in a unit of time. The ratio of received packets to
transmitted packets is named goodput. As shown in Fig. 7, when the network load
increases up to the network capacity, the throughput and goodput increase. However,
when the traffic load exceeds the network capacity, the buffers overflow and packets
are dropped.
Fig. 7 - Network performance upon congestion (delay and total output rate versus
network load; both degrade once the load passes the network capacity and congestion
sets in)
Congestion control mechanisms aim either to prevent congestion or to remove it once
it has occurred. Congestion prevention mechanisms control congestion by adopting good
retransmission, acknowledgment, and discard policies. Once congestion has occurred,
different signalling mechanisms are used by the routers to inform the sender or
receiver of the congestion so that it can be compensated for.
Congestion can be avoided by appropriately adjusting the retransmission timers and
choosing a retransmission policy. The time at which a receiver sends an
acknowledgment affects the sender's transmission speed, so by controlling this time,
congestion can be controlled. For instance, if the receiver sends acknowledgments
later, but before the expiration of the sender's timer, the traffic arriving at the
network slows down. A discard policy can prevent congestion by discarding the packets
that have the least effect on transmission or application quality, e.g. discarding
less important voice or video packets rather than the data packets of an FTP session.
The above prevention mechanisms are open-loop control mechanisms and can be
implemented at the sender or receiver side.
To alleviate congestion that has already occurred, closed-loop control mechanisms are
used to slow down the traffic arriving at the network. When a router experiences
congestion, it may ask the previous routers to slow down their packet transmissions.
Another mechanism is to inform the senders or receivers directly, instead of the
routers on the path, so that they adjust the amount of data injected into the network.
In other words, congestion is controlled by the TCP senders and receivers, not by the
routers on the path. The main tool of closed-loop congestion control is the window
size. The window size is determined by the minimum of the available capacity of the
TCP receiver's buffer (RWND) and a measure of the congestion level in the network
(CWND). RWND is advertised by the receiver side and is used for flow control. The
CWND size is determined on the sender side and is used for congestion control.
TCP Congestion Policy
The TCP congestion control mechanism has three components: slow start, congestion
avoidance, and congestion detection. A sender starts transmitting at a slow rate and
increases the rate rapidly as long as the congestion window is below a threshold.
When the threshold is reached, the rate of increase is reduced to avoid congestion.
If congestion is detected, the sender falls back either to the slow start rate or to
a congestion avoidance rate, depending on how the congestion was detected.
In the slow start phase, a sender begins with the minimum value of CWND, which has
been agreed upon at connection establishment. Suppose RWND is much higher than CWND.
After transmitting the first segment and receiving the corresponding ACK, the sender
doubles CWND. That means two segments can be transmitted before an acknowledgment is
received. If the sender receives ACKs for the two transmitted segments, it doubles
CWND again, as shown in Fig. 8. This exponential increase of CWND stops and the
congestion avoidance phase starts when the CWND size reaches a threshold value. In
the congestion avoidance phase, CWND increases linearly, not exponentially, as long
as no congestion is detected: the sender increases CWND by one segment for each round
of acknowledgments, as shown in Fig. 9. The slower growth of CWND prevents congestion
to some extent, but it cannot avoid it completely.
Fig. 8 - Slow start phase (CWND starts at one segment and doubles after each round of
acknowledgments: 1, 2, 4, ... segments)
A TCP sender assumes congestion has happened upon RTO expiration or upon receiving
three duplicate ACKs for a segment. A new slow start or congestion avoidance phase
follows congestion detection. If detection is based on a time-out, the threshold is
set to half of the current window size and CWND is reset to the minimum CWND size of
the slow start phase. Otherwise, if detection is by three duplicate ACKs, the
threshold is set to half of the current window size, CWND is set to the threshold
value, and a congestion avoidance phase starts. The different reactions to congestion
detection reflect the fact that a time-out signals more severe congestion than three
duplicate ACKs: in the former case, there is no information about the transmitted
segments, whereas the latter means some segments have reached the receiver
successfully.
Fig. 9 - Congestion avoidance phase (CWND increases linearly, by one segment per
round of acknowledgments)
In summary, the congestion control mechanism of TCP can be categorized into three
phases. It starts with a slow start phase that increases CWND exponentially up to the
congestion avoidance threshold. Once it reaches that threshold, it increases CWND
additively, and upon congestion detection it follows with a multiplicative decrease.
Fig. 10 illustrates the three phases.
Fig. 10 - TCP congestion control behaviour (CWND over time: slow start, additive
increase, and multiplicative decrease triggered by a time-out or three duplicate
ACKs)
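The behaviour of Fig. 10 can be mimicked with a short simulation. The sketch below
(Python; the event names, the initial threshold, and the per-round granularity are
simplifying assumptions) applies exponential growth below the threshold, additive
increase above it, and the two different reactions to congestion detection.

    # Per-round sketch of the congestion policy described above. CWND doubles each
    # round in slow start, grows by one segment per round in congestion avoidance,
    # and reacts differently to a time-out and to three duplicate ACKs.
    def simulate(events, initial_cwnd=1, initial_ssthresh=16):
        cwnd, ssthresh, trace = initial_cwnd, initial_ssthresh, []
        for event in events:
            if event == 'ack-round':                 # a full window was acknowledged
                if cwnd < ssthresh:
                    cwnd *= 2                        # slow start: exponential increase
                else:
                    cwnd += 1                        # congestion avoidance: additive increase
            elif event == 'timeout':
                ssthresh = max(cwnd // 2, 2)         # multiplicative decrease of the threshold
                cwnd = 1                             # restart from slow start
            elif event == '3-dup-acks':
                ssthresh = max(cwnd // 2, 2)
                cwnd = ssthresh                      # continue in congestion avoidance
            trace.append((event, cwnd, ssthresh))
        return trace

    events = ['ack-round'] * 5 + ['3-dup-acks'] + ['ack-round'] * 3 + ['timeout', 'ack-round']
    for step in simulate(events):
        print(step)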
User Datagram Protocol (UDP)
UDP, like TCP, serves the application layer by using the network layer, but with far
less machinery. Table 3 compares the two protocols from the viewpoint of the services
they provide to the application layer. UDP uses port numbers for
application-to-application communication, but it does not establish a connection as
TCP does, so it is termed connectionless. A UDP sender transmits the data unit of the
upper-layer process and does not care whether the transmission is reliable. In other
words, if the data unit is lost due to congestion, or duplicated on the path, UDP
does not recognize it. UDP does not deploy ACKs, flow control, or congestion control;
that is why UDP is unreliable. There is minimal error detection, and erroneous
packets are simply discarded. The beauty of UDP is its simplicity. The only overhead
that UDP adds to a packet is what is needed to establish process-to-process
communication, rather than the host-to-host communication of the IP layer. Therefore,
UDP is very suitable for small-message communication and for applications that do
not need strong reliability, such as client/server request/reply exchanges or video
conferencing. It can also be used by applications that have their own internal error
and flow control, such as the Trivial File Transfer Protocol (TFTP).
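The connectionless style of UDP is visible directly at the socket interface. The
sketch below (Python; the address and function names are placeholders) sends and
receives single datagrams with no handshake and no acknowledgment.

    import socket

    ADDR = ('127.0.0.1', 6000)   # illustrative endpoint

    def run_receiver():
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx:
            rx.bind(ADDR)
            payload, sender = rx.recvfrom(65535)   # one whole datagram per call
            print(len(payload), 'bytes from', sender)

    def run_sender():
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
            tx.sendto(b'hello over UDP', ADDR)     # no connection, no ACK expected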
Table 3 - TCP and UDP services comparison

Service                                                        TCP    UDP
Using port numbers for process-to-process communication        yes    yes
Establishing a process-to-process connection                   yes    no
Increasing overhead & interaction between sender & receiver    yes    no
Doing error control                                            yes    minimal
Doing flow control                                             yes    no
Sending an ACK for each received packet                        yes    no
The data unit of UDP consists of an 8-byte header and a variable-size data field, as
shown in Fig. 11. The header consists of the source and destination port numbers,
total length, and checksum fields.
Fig. 11: User Datagram Format (Source Port Number, Destination Port Number, Total
Length, Checksum; 16 bits each, followed by Data)
UDP port numbers work in the same way as TCP port numbers. They allow different
applications to maintain a specific path for their data; in other words, multiple
applications can be distinguished by their port numbers. You can think of a port
number as the identifier of an interface queue in which the application on the sender
or receiver side keeps data before transmission or after reception. Even if an
application wants to communicate with multiple applications, it can keep one port
number for all incoming and outgoing data. Port numbers can be reserved for an
application.
The total length field gives the UDP datagram size, which is the sum of the bytes
contained in the header and the data section (sometimes named the payload). The
maximum size of a datagram is 65535 bytes.
The use of the checksum field is optional in UDP, unlike in TCP. If the checksum is
not calculated by the sender, the field is filled with zeros.
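As a final illustration, the 8-byte header of Fig. 11 can be packed as follows (a
Python sketch; the port numbers and payload are illustrative, and the checksum is
left at zero to mark it as not computed).

    import struct

    # Sketch of the 8-byte UDP header of Fig. 11. Total length covers the header
    # plus the payload; a checksum of zero signals that the sender did not compute it.
    def pack_udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
        total_length = 8 + len(payload)              # header (8 bytes) + data
        return struct.pack('!HHHH', src_port, dst_port, total_length, checksum)

    datagram = pack_udp_header(5004, 53, b'example payload') + b'example payload'
    print(len(datagram))   # 8 + 15 = 23 bytes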