Chapter 3 Transport Layer
Transport Layer
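▪ provide logical communication between application processes running on different hosts
▪ transport protocols run in end systems
• send side: breaks app messages into segments, passes to network layer
• rcv side: reassembles segments into messages, passes to application layer
[Figure: logical end-end transport between the protocol stacks (application, transport, network, data link, physical) of two end systems]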
▪ reliable, in-order delivery (TCP)
• flow control
• connection setup
▪ unreliable, unordered delivery: UDP
• no-frills extension of “best-effort” IP
▪ services not available:
• delay guarantees
• bandwidth guarantees
[Figure: logical end-end transport across a path of routers, each with network, data link, and physical layers]
[Figure: processes P2 to P6 attached at the application layer of three hosts, each with transport, network, link, and physical layers; server: IP address B]
wraparound 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum          1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum     0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
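The wraparound arithmetic above can be sketched in Python. The two 16-bit operands do not survive in the rows above; the values used here are an assumption, chosen so that their raw sum equals the 17-bit wraparound row. The function name `internet_checksum` is hypothetical.

```python
def internet_checksum(words):
    """One's-complement sum of 16-bit words, then bitwise complement."""
    total = 0
    for w in words:
        total += w
        # wraparound: fold any carry out of bit 15 back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# assumed operands (not shown in the rows above)
a = 0b1110011001100110
b = 0b1101010101010101

raw = a + b                              # 17-bit result before wraparound
wrapped = (raw & 0xFFFF) + (raw >> 16)   # carry added back in
checksum = internet_checksum([a, b])

print(format(raw, "017b"))       # 11011101110111011
print(format(wrapped, "016b"))   # 1011101110111100
print(format(checksum, "016b"))  # 0100010001000011
```

A receiver can run the same function over the data words plus the received checksum; a result of 0 means no error was detected.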
Transport Layer 3-26
rdt2.0: channel with bit errors
▪ underlying channel may flip bits in packet
• checksum to detect bit errors
▪ the question: how to recover from errors:
• acknowledgements (ACKs): receiver explicitly tells
sender that pkt received OK
• negative acknowledgements (NAKs): receiver explicitly
tells sender that pkt had errors
• sender retransmits pkt on receipt of NAK
▪ new mechanisms in rdt2.0 (beyond rdt1.0):
• error detection
• receiver feedback: control msgs (ACK,NAK) rcvr->sender
How do humans recover from “errors” during conversation?
rdt_rcv(rcvpkt) &&
notcorrupt(rcvpkt)
extract(rcvpkt,data)
deliver_data(data)
udt_send(ACK)
extract(rcvpkt,data)
deliver_data(data)
sndpkt = make_pkt(ACK, chksum)
udt_send(sndpkt)
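The receiver transition above, together with the sender's retransmit-on-NAK rule, can be sketched as follows. This is a minimal illustration, not the FSM from the slides: the byte-sum checksum, the dictionary packet layout, and the one-shot bit-flipping channel are all assumptions, and the feedback path is assumed error-free (rdt2.0 does not handle corrupted ACKs/NAKs).

```python
def checksum(data: bytes) -> int:
    # toy checksum for illustration only (assumption): byte sum mod 2**16
    return sum(data) % 65536

def make_pkt(data: bytes) -> dict:
    return {"data": data, "chksum": checksum(data)}

def notcorrupt(pkt: dict) -> bool:
    return checksum(pkt["data"]) == pkt["chksum"]

def rdt_rcv(rcvpkt: dict, delivered: list) -> str:
    """Receiver: deliver and ACK a clean packet, NAK a corrupted one."""
    if notcorrupt(rcvpkt):
        delivered.append(rcvpkt["data"])   # extract + deliver_data
        return "ACK"
    return "NAK"

def rdt_send(data: bytes, channel, delivered: list) -> int:
    """Sender: retransmit until an ACK comes back; returns transmissions used."""
    sndpkt = make_pkt(data)
    attempts = 0
    while True:
        attempts += 1
        if rdt_rcv(channel(sndpkt), delivered) == "ACK":
            return attempts

def flip_first_time():
    """Hypothetical channel that flips one bit in the first packet only."""
    state = {"first": True}
    def channel(pkt):
        if state["first"]:
            state["first"] = False
            bad = bytes([pkt["data"][0] ^ 0x01]) + pkt["data"][1:]
            return {"data": bad, "chksum": pkt["chksum"]}
        return pkt
    return channel

delivered = []
attempts = rdt_send(b"hello", flip_first_time(), delivered)
print(attempts, delivered)
```

The first transmission is NAKed because of the flipped bit, and the retransmission is delivered once.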
rdt3.0 in action
[Figure: sender/receiver timelines exchanging pkt0/pkt1 and ack0/ack1, with loss (X), timeout, resend, and duplicate detection. Panels: (b) packet loss, (c) ACK loss, (d) premature timeout/ delayed ACK]
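The packet-loss and ACK-loss panels can be replayed with a small stop-and-wait simulation using 1-bit sequence numbers. This is a sketch under stated assumptions: losses are specified by transmission index, a timeout is modelled as an immediate retry, and the function name `run_rdt3` is hypothetical. The premature-timeout panel is not modelled.

```python
def run_rdt3(messages, lose_pkts=frozenset(), lose_acks=frozenset()):
    """Stop-and-wait with alternating sequence numbers 0/1.

    lose_pkts / lose_acks hold the indices (1-based) of the packet
    transmissions whose packet or ACK is dropped (assumed loss model).
    """
    delivered, log = [], []
    expected = 0      # receiver: sequence number it is waiting for
    tx = 0            # counts every packet transmission
    for i, msg in enumerate(messages):
        seq = i % 2
        while True:
            tx += 1
            log.append(f"send pkt{seq}")
            if tx in lose_pkts:
                log.append("timeout")          # packet lost, no ACK arrives
                continue
            if seq == expected:
                delivered.append(msg)          # new data: deliver once
                expected ^= 1
            else:
                log.append("detect duplicate") # already delivered: just re-ACK
            if tx in lose_acks:
                log.append("timeout")          # ACK lost on the way back
                continue
            log.append(f"rcv ack{seq}")
            break
    return delivered, log

# ACK-loss scenario: the ACK for the second transmission (pkt1) is lost
delivered, log = run_rdt3(["a", "b"], lose_acks={2})
print(log)
```

Despite the retransmission, each message is delivered to the application exactly once, because the receiver recognises the repeated sequence number.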
Performance of rdt3.0
▪ rdt3.0 is correct, but performance stinks
▪ e.g.: 1 Gbps link, 15 ms prop. delay, 8000 bit packet:
$D_{trans} = \frac{L}{R} = \frac{8000\ \text{bits}}{10^{9}\ \text{bits/sec}} = 8\ \text{microsecs}$
▪ $U_{sender}$: utilization – fraction of time sender busy sending
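The numbers in this example can be checked with a few lines of Python. The utilization formula used here, U_sender = (L/R) / (RTT + L/R) for a stop-and-wait sender, and the assumption RTT = 2 × propagation delay are not stated in the surviving text above; treat them as assumptions of this sketch.

```python
L = 8000            # packet length, bits
R = 1e9             # link rate, bits/sec
prop_delay = 15e-3  # one-way propagation delay, sec

d_trans = L / R                        # transmission delay, sec
rtt = 2 * prop_delay                   # assumed: RTT is twice the prop. delay
u_sender = d_trans / (rtt + d_trans)   # assumed stop-and-wait utilization

print(f"D_trans  = {d_trans * 1e6:.0f} microsecs")
print(f"U_sender = {u_sender:.5f}")
```

Under these assumptions the sender is busy for roughly 0.027% of the time, which is the sense in which the performance is poor.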
[Figure: simple telnet scenario]
• User types ‘C’: Seq=42, ACK=79, data = ‘C’
• host ACKs receipt of ‘C’, echoes back ‘C’: Seq=79, ACK=43, data = ‘C’
• host ACKs receipt of echoed ‘C’: Seq=43, ACK=80
[Figure: SampleRTT and EstimatedRTT plotted over time]
timeout interval: estimated RTT plus “safety margin”
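One common way to turn "estimated RTT plus safety margin" into numbers is an exponentially weighted moving average. The sketch below follows the update order and constants of RFC 6298 (alpha = 1/8, beta = 1/4, margin = 4 × deviation); none of these values appear in the text above, so they are assumptions here, as are the function and variable names.

```python
def update_rtt(estimated_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """One estimator step for a new SampleRTT (all times in the same unit)."""
    # deviation is updated first, using the estimate from before this sample
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt   # estimate + safety margin
    return estimated_rtt, dev_rtt, timeout_interval

# hypothetical values in milliseconds
est, dev, timeout = update_rtt(estimated_rtt=100.0, dev_rtt=10.0, sample_rtt=120.0)
print(est, dev, timeout)   # 102.5 12.5 152.5
```

A single slow sample moves the estimate only slightly but widens the safety margin, which makes a premature timeout less likely.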
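[Figure: two Host A/Host B timelines.
lost ACK scenario: A sends Seq=92, 8 bytes of data (SendBase=92); ACK=100 is lost (X); A times out and resends Seq=92, 8 bytes of data; ACK=100 arrives; SendBase=100.
premature timeout: A sends Seq=92, 8 bytes of data and Seq=100, 20 bytes of data; A times out before ACK=100 and ACK=120 arrive and resends Seq=92, 8 bytes of data; SendBase=100, then SendBase=120; B replies ACK=120; SendBase=120]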
TCP: retransmission scenarios
[Figure: cumulative ACK. Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data; ACK=100 is lost (X); ACK=120 arrives before the timeout]
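The cumulative ACK scenario can be traced with a small sketch of the sender's SendBase bookkeeping. The function name `on_ack` and the list of in-flight sequence numbers are assumptions for illustration.

```python
def on_ack(send_base, ack_num):
    """Cumulative ACK: every byte below ack_num is acknowledged."""
    if ack_num > send_base:
        return ack_num      # slide SendBase forward
    return send_base        # old or duplicate ACK: no change

send_base = 92
in_flight = [92, 100]       # Seq=92 (8 bytes) and Seq=100 (20 bytes)

# ACK=100 is lost; only ACK=120 reaches the sender
send_base = on_ack(send_base, 120)
still_unacked = [seq for seq in in_flight if seq >= send_base]
print(send_base, still_unacked)   # 120 []
```

Because ACK=120 covers both segments, the lost ACK=100 causes no retransmission.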
TCP ACK generation [RFC 1122, RFC 2581]
[Figure: Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data, which is lost (X); Host B answers with ACK=100 four times; before the timeout expires, A resends Seq=100, 20 bytes of data]
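The sender behaviour in this figure (resend before the timer expires once the same ACK keeps arriving) can be sketched with a duplicate-ACK counter. The threshold of three duplicates is the usual fast-retransmit rule and is an assumption here, since it is not stated in the text above; the function name is hypothetical.

```python
def sender_on_acks(acks, send_base):
    """Process ACK numbers in arrival order; return (send_base, actions)."""
    dup_count = 0
    actions = []
    for ack in acks:
        if ack > send_base:
            send_base = ack          # new data acknowledged
            dup_count = 0
        else:
            dup_count += 1           # duplicate ACK for send_base
            if dup_count == 3:       # assumed threshold: triple duplicate ACK
                actions.append(f"retransmit Seq={ack}")
    return send_base, actions

result = sender_on_acks([100, 100, 100, 100], send_base=92)
print(result)   # (100, ['retransmit Seq=100'])
```

The first ACK=100 acknowledges the 8 bytes starting at 92; the next three are duplicates and trigger the early resend of the segment starting at 100.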
flow control
receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast
[Figure: receiver protocol stack: segments arrive from sender through IP code]
TCP flow control
▪ receiver “advertises” free buffer space by including rwnd value in TCP header of receiver-to-sender segments
• RcvBuffer size set via socket options (typical default is 4096 bytes)
• many operating systems autoadjust RcvBuffer
▪ sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value
▪ guarantees receive buffer will not overflow
[Figure: receiver-side buffering: TCP segment payloads enter RcvBuffer, which holds buffered data plus free buffer space (rwnd); data leaves to application process]
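The two sides of this rule can be written down in a few lines. The bookkeeping variables (`last_byte_rcvd`, `last_byte_read`, `last_byte_sent`, `last_byte_acked`) and the formula rwnd = RcvBuffer − (LastByteRcvd − LastByteRead) are the conventional textbook formulation and are assumptions here, not taken from the bullets above.

```python
def rwnd(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver: free buffer space to advertise."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def can_send(last_byte_sent, last_byte_acked, nbytes, rwnd_value):
    """Sender: keep unacked ("in-flight") data within the advertised window."""
    return (last_byte_sent - last_byte_acked) + nbytes <= rwnd_value

# hypothetical state: 4096-byte buffer, 2000 bytes received but not yet read
window = rwnd(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok_small = can_send(last_byte_sent=5000, last_byte_acked=4000, nbytes=1000, rwnd_value=window)
ok_large = can_send(last_byte_sent=5000, last_byte_acked=4000, nbytes=1200, rwnd_value=window)
print(window, ok_small, ok_large)   # 2096 True False
```

With 1000 bytes already in flight, another 1000 bytes fit in the 2096-byte window but 1200 bytes do not.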
Chapter 3 outline
3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 principles of congestion control
3.7 TCP congestion control
[Figure: client and server each hold, between application and network:]
connection state: ESTAB
connection variables:
• seq # client-to-server, server-to-client
• rcvBuffer size at server, client
2-way handshake:
Q: will 2-way handshake always work in network?
▪ variable delays
▪ retransmitted messages (e.g. req_conn(x)) due to message loss
▪ message reordering
▪ can’t “see” other side
[Figure: 2-way handshake: “Let’s talk” / “OK”; client chooses x and sends req_conn(x); server replies acc_conn(x); both sides ESTAB]
[Figure: 2-way handshake failure scenarios. Left: client retransmits req_conn(x); connection x completes, client terminates, server forgets x; the retransmitted req_conn(x) then arrives and the server enters ESTAB: half open connection! (no client!). Right: client retransmits req_conn(x) and data(x+1); after connection x completes and the server forgets x, the retransmitted req_conn(x) and data(x+1) arrive and the server accepts data(x+1)]
TCP 3-way handshake
▪ client state: LISTEN → SYNSENT → ESTAB
▪ server state: LISTEN → SYN RCVD → ESTAB
▪ client: choose init seq num, x; send TCP SYN msg (SYNbit=1, Seq=x)
▪ server: choose init seq num, y; send TCP SYNACK msg, acking SYN (SYNbit=1, Seq=y, ACKbit=1; ACKnum=x+1)
▪ client: received SYNACK(x) indicates server is live; send ACK for SYNACK; this segment may contain client-to-server data (ACKbit=1, ACKnum=y+1)
▪ server: received ACK(y) indicates client is live
[Figure: TCP 3-way handshake FSM]
• closed → listen (server): Socket connectionSocket = welcomeSocket.accept();
• closed → SYN sent (client): Socket clientSocket = new Socket("hostname","port number"); send SYN(seq=x)
• listen → SYN rcvd: on SYN(x), send SYNACK(seq=y, ACKnum=x+1); create new socket for communication back to client
• SYN sent → ESTAB: on SYNACK(seq=y, ACKnum=x+1), send ACK(ACKnum=y+1)
• SYN rcvd → ESTAB: on ACK(ACKnum=y+1), Λ
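The three messages and the state changes they trigger can be walked through in a short sketch. The dictionary message layout, the function name `three_way_handshake`, and the example sequence numbers are assumptions for illustration; no real sockets are involved.

```python
def three_way_handshake(x, y):
    """Walk client and server through SYN, SYNACK, ACK; return final states."""
    client, server = "CLOSED", "LISTEN"
    trace = []

    # 1. client chooses x and sends SYN
    syn = {"SYNbit": 1, "seq": x}
    client = "SYNSENT"
    trace.append(syn)

    # 2. server chooses y and sends SYNACK, acknowledging the SYN
    synack = {"SYNbit": 1, "seq": y, "ACKbit": 1, "acknum": syn["seq"] + 1}
    server = "SYN RCVD"
    trace.append(synack)

    # 3. client checks that its SYN was acknowledged, then sends ACK
    if synack["acknum"] != x + 1:
        raise ValueError("SYNACK does not acknowledge our SYN")
    ack = {"ACKbit": 1, "acknum": synack["seq"] + 1}
    client = "ESTAB"
    trace.append(ack)

    # 4. server checks that its SYNACK was acknowledged
    if ack["acknum"] != y + 1:
        raise ValueError("ACK does not acknowledge our SYNACK")
    server = "ESTAB"
    return client, server, trace

client_state, server_state, trace = three_way_handshake(x=100, y=300)
print(client_state, server_state)
```

Each side reaches ESTAB only after seeing its own initial sequence number acknowledged, which is the check the 2-way handshake lacks.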
[Figure: two plots against λin (up to R/2): λout, levelling off at R/2, and delay]
▪ maximum per-connection throughput: R/2
❖ large delays as arrival rate, λin, approaches capacity
Causes/costs of congestion: scenario 2
▪ one router, finite buffers
▪ sender retransmission of timed-out packet
• application-layer input = application-layer output: λin =
λout
• transport-layer input includes retransmissions: λ'in ≥ λin
▪ sender sends only when router buffers available
[Figure: Host A sends to Host B through one router with finite buffers (“no buffer space!”); plot of λout against λin, up to R/2]
Causes/costs of congestion: scenario 2
Idealization: known loss
▪ packets can be lost, dropped at router due to full buffers
▪ sender only resends if packet known to be lost
[Figure: λout against λin, both up to R/2: when sending at R/2, some packets are retransmissions but asymptotic goodput is still R/2 (why?)]
Causes/costs of congestion: scenario 2
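Realistic: duplicates
▪ packets can be lost, dropped at router due to full buffers
▪ sender times out prematurely, sending two copies, both of which are delivered
[Figure: Host A times out and sends a copy of a packet toward Host B (λin, λ'in, λout); plot of λout against λin, up to R/2: when sending at R/2, some packets are retransmissions]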
“costs” of congestion:
▪ more work (retrans) for given “goodput”
▪ unneeded retransmissions: link carries multiple copies of pkt
• decreasing goodput
[Figure: Host C and Host D; plot of λout against λ'in, both up to C/2]
[Figure: sending rate over time: probing for bandwidth]
TCP Congestion Control: details
[Figure: sender sequence number space: last byte ACKed; sent, not-yet ACKed (“in-flight”); last byte sent; cwnd spans the in-flight region]
▪ sender limits transmission:
$\text{LastByteSent} - \text{LastByteAcked} \le \text{cwnd}$
TCP sending rate:
▪ roughly: send cwnd bytes, wait RTT for ACKS, then send more bytes
$\text{rate} \approx \frac{\text{cwnd}}{\text{RTT}}$ bytes/sec
• done by incrementing cwnd for every ACK received
[Figure: timeline marked in RTTs, with bursts growing from one segment to four segments]
Implementation:
▪ variable ssthresh
▪ on loss event, ssthresh
is set to 1/2 of cwnd just
before loss event
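How ssthresh shapes the window can be seen in a small per-RTT simulation. This is a sketch under several assumptions not stated above: cwnd is counted in MSS units, it doubles each RTT while below ssthresh (capped at ssthresh), grows by 1 MSS per RTT afterwards, and restarts from 1 MSS after a loss. The function name `next_cwnd` and the loss position are hypothetical.

```python
def next_cwnd(cwnd, ssthresh, loss):
    """Advance the congestion window by one RTT (units: MSS)."""
    if loss:
        ssthresh = max(cwnd // 2, 1)      # 1/2 of cwnd just before loss event
        cwnd = 1                          # assumed: restart from 1 MSS
    elif cwnd < ssthresh:
        cwnd = min(cwnd * 2, ssthresh)    # assumed: exponential growth phase
    else:
        cwnd += 1                         # assumed: linear growth phase
    return cwnd, ssthresh

cwnd, ssthresh = 1, 8
history = [cwnd]
for rtt in range(8):
    cwnd, ssthresh = next_cwnd(cwnd, ssthresh, loss=(rtt == 6))
    history.append(cwnd)

print(history)    # [1, 2, 4, 8, 9, 10, 11, 1, 2]
print(ssthresh)   # 5
```

After the loss at cwnd = 11, ssthresh drops to 5, so the next exponential phase ends earlier than the first one did.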
[Figure: FSM fragment, recovery state: on duplicate ACK, cwnd = cwnd + MSS; transmit new segment(s), as allowed]
TCP throughput
▪ avg. TCP thruput as function of window size, RTT?
• ignore slow start, assume always data to send
▪ W: window size (measured in bytes) where loss occurs
• avg. window size (# in-flight bytes) is ¾ W
• avg. thruput is 3/4W per RTT
$\text{avg TCP thruput} = \frac{3}{4}\,\frac{W}{\text{RTT}}$ bytes/sec
[Figure: window size oscillating between W/2 and W]
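A quick numeric check of this estimate, as a sketch: if the window ramps linearly from W/2 up to W and then halves, its average is the midpoint, ¾ W. The example values for W and RTT are hypothetical.

```python
def avg_tcp_throughput(w_bytes, rtt_sec):
    """Average throughput in bytes/sec for a window oscillating between W/2 and W."""
    avg_window = (w_bytes / 2 + w_bytes) / 2   # midpoint of the ramp = 3/4 W
    return avg_window / rtt_sec

# hypothetical values: W = 100,000 bytes, RTT = 250 ms
thruput = avg_tcp_throughput(100_000, 0.25)
print(thruput)   # 300000.0
```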
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R; plot axis: Connection 1 throughput, up to R]
Fairness (more)
Fairness and UDP
▪ multimedia apps often do not use TCP
• do not want rate throttled by congestion control
▪ instead use UDP:
• send audio/video at constant rate, tolerate packet loss
Fairness, parallel TCP connections
▪ application can open multiple parallel connections between two hosts
▪ web browsers do this
▪ e.g., link of rate R with 9 existing connections:
• new app asks for 1 TCP, gets rate R/10
• new app asks for 11 TCPs, gets R/2
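The arithmetic behind this example, assuming every TCP connection on the link gets an equal share of R (the function name `app_share` is hypothetical):

```python
from fractions import Fraction

def app_share(existing_conns, new_conns):
    """Fraction of link rate R obtained by an app opening new_conns connections."""
    return Fraction(new_conns, existing_conns + new_conns)

one = app_share(9, 1)       # 1 of 10 connections
eleven = app_share(9, 11)   # 11 of 20 connections
print(one, eleven, float(eleven))   # 1/10 11/20 0.55
```

With 11 connections the new app holds 11 of the 20 equal shares, i.e. 0.55 R, which the example rounds to R/2.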
[Figure: IP datagram with ECN field values ECN=00 and ECN=11]
Chapter 3: summary
▪ principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control
▪ instantiation, implementation in the Internet
• UDP
• TCP
next:
▪ leaving the network “edge” (application, transport layers)
▪ into the network “core”
▪ two network layer chapters:
• data plane
• control plane