Chapter 4: Transport Layer
Magda El Zarki
Prof. of CS
UC Irvine
Chapter 4: Transport Layer
Our goals:
❒ understand principles behind transport layer services:
  ❍ multiplexing/demultiplexing
  ❍ reliable data transfer
  ❍ flow control
  ❍ congestion control
❒ learn about transport protocols in the Internet:
  ❍ UDP: connectionless transport
  ❍ TCP: connection-oriented transport
  ❍ TCP congestion control
Chapter 4 outline
❒ 4.1 Transport-layer services
❒ 4.2 Multiplexing and demultiplexing
❒ 4.3 Connectionless transport: UDP
❒ 4.4 Principles of reliable data transfer
❒ 4.5 Connection-oriented transport: TCP
  ❍ segment structure
  ❍ reliable data transfer
  ❍ flow control
  ❍ connection management
❒ 4.6 Principles of congestion control
❒ 4.7 TCP congestion control
Transport services and protocols
❒ provide logical communication between app processes
❒ reliable, in-order delivery: TCP
  ❍ connection setup
❒ unreliable, unordered delivery: UDP
  ❍ no-frills extension of “best-effort” IP
❒ services not available:
  ❍ delay guarantees
  ❍ bandwidth guarantees
(figure: transport protocols run in the end hosts — application, transport, network, data link, physical — while routers carry only the network, data link, and physical layers)
Chapter 4 outline
❒ 4.1 Transport-layer services
❒ 4.2 Multiplexing and demultiplexing
❒ 4.3 Connectionless transport: UDP
❒ 4.4 Principles of reliable data transfer
❒ 4.5 Connection-oriented transport: TCP
  ❍ segment structure
  ❍ reliable data transfer
  ❍ flow control
  ❍ connection management
❒ 4.6 Principles of congestion control
❒ 4.7 TCP congestion control
Multiplexing/demultiplexing
Demultiplexing at rcv host: delivering received segments to correct socket
Multiplexing at send host: gathering data from multiple sockets, enveloping each data with header (later used for demultiplexing)
(figure: application processes P1–P4 communicating through sockets on host 1, host 2, and host 3)
How demultiplexing works
❒ host receives IP datagrams
  ❍ each datagram has source IP address, destination IP address
  ❍ each segment has source and destination port numbers
❒ host uses IP addresses & port numbers to direct each segment to the appropriate socket
(figure: TCP/UDP segment format — 32-bit rows beginning with source port # and dest port #)
(figure: demultiplexing example — a segment with SP: 5775, DP: 80, S-IP: B, D-IP: C is delivered to the process whose socket matches those values)
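To make port-based demultiplexing concrete, here is a minimal sketch (mine, not from the slides) using Python's standard socket API: the OS delivers each arriving UDP datagram to the socket bound to the datagram's destination port, and the application reads the sender's (IP, port) from recvfrom(). The port number 9157 is just an illustrative choice.

```python
import socket

# Receiver: bind a UDP socket to a destination port; the OS demultiplexes
# arriving datagrams to this socket by matching their dest port number.
rcv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rcv.bind(("127.0.0.1", 9157))          # 9157 is an arbitrary example port

# Sender: an ephemeral source port is assigned automatically.
snd = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
snd.sendto(b"hello", ("127.0.0.1", 9157))

data, (src_ip, src_port) = rcv.recvfrom(2048)
print(data, src_ip, src_port)          # source (IP, port) comes from the datagram
```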
UDP: User Datagram Protocol [RFC 768]
❒ “no frills,” “bare bones” Internet transport protocol
❒ “best effort” service, UDP segments may be:
  ❍ lost
  ❍ delivered out of order to app
❒ connectionless:
  ❍ no handshaking between UDP sender, receiver
  ❍ each UDP segment handled independently of others

Why is there a UDP?
❒ no connection establishment (which can add delay)
❒ simple: no connection state at sender, receiver
❒ small segment header
❒ no congestion control: UDP can blast away as fast as desired
UDP: more
❒ often used for streaming multimedia apps
(figure: UDP segment format — 32-bit rows: source port #, dest port #, length, checksum, application data)

UDP checksum
Goal: detect “errors” (e.g., flipped bits) in the transmitted segment

Sender:
❒ treat segment contents as sequence of 16-bit integers
❒ checksum: addition (1’s complement sum) of segment contents
❒ sender puts checksum value into UDP checksum field

Receiver:
❒ compute checksum of received segment
❒ check if computed checksum equals checksum field value:
  ❍ NO - error detected
  ❍ YES - no error detected. But maybe errors nonetheless? More later ….
Internet Checksum Example
❒ Note
  ❍ When adding numbers, a carryout from the most significant bit needs to be added to the result
❒ Example: add two 16-bit integers

                   1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                   1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
  wraparound     1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
  sum               1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
  checksum          0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
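As a quick cross-check of the arithmetic above, here is a small Python sketch (not part of the original slides) of the 16-bit 1's complement sum with wraparound; the function names are only illustrative.

```python
def ones_complement_sum16(words):
    """16-bit one's complement sum: any carry out of bit 15 wraps back into bit 0."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total

def internet_checksum(words):
    """Checksum = one's complement of the one's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF

# The two 16-bit integers from the slide example
a, b = 0b1110011001100110, 0b1101010101010101
print(f"{ones_complement_sum16([a, b]):016b}")   # 1011101110111100 (sum after wraparound)
print(f"{internet_checksum([a, b]):016b}")       # 0100010001000011 (checksum)
```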
Chapter 4 outline
❒ 4.1 Transport-layer services
❒ 4.2 Multiplexing and demultiplexing
❒ 4.3 Connectionless transport: UDP
❒ 4.4 Principles of reliable data transfer
❒ 4.5 Connection-oriented transport: TCP
  ❍ segment structure
  ❍ reliable data transfer
  ❍ flow control
  ❍ connection management
❒ 4.6 Principles of congestion control
❒ 4.7 TCP congestion control
Principles of Reliable data transfer
❒ important in app., transport, link layers
❒ top-10 list of important networking topics!
(figure: rdt sender on the send side, rdt receiver on the receive side, communicating over an unreliable channel)
Rdt2.0: channel with bit errors
❒ underlying channel may flip bits in packet
  ❍ checksum to detect bit errors
❒ how to recover from errors?
  ❍ acknowledgements (ACKs) and negative acknowledgements (NAKs) from the receiver
  ❍ sender retransmits pkt on receipt of NAK

receiver action when a packet arrives uncorrupted:
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  extract(rcvpkt,data)
  deliver_data(data)
  udt_send(ACK)
rdt2.0: operation with no errors
sender FSM:
  state “Wait for call from above”:
    rdt_send(data):
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      → “Wait for ACK or NAK”
  state “Wait for ACK or NAK”:
    rdt_rcv(rcvpkt) && isNAK(rcvpkt): udt_send(sndpkt)
    rdt_rcv(rcvpkt) && isACK(rcvpkt): → “Wait for call from above”

receiver FSM:
  rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NAK)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    extract(rcvpkt,data)
    deliver_data(data)
    udt_send(ACK)

(no-errors path: sender transmits sndpkt, receiver finds it not corrupt, delivers the data, and returns an ACK)
rdt2.0: error scenario
(same FSMs as above, with the error path highlighted: the receiver detects a corrupt pkt and sends a NAK; the sender, on rdt_rcv(rcvpkt) && isNAK(rcvpkt), retransmits sndpkt; the retransmission arrives uncorrupted and is extracted, delivered, and ACKed)
rdt2.0 has a fatal flaw!
What happens if ACK/NAK corrupted?
❒ sender doesn’t know what happened at receiver!
❒ can’t just retransmit: possible duplicate

Handling duplicates:
❒ sender retransmits current pkt if ACK/NAK garbled
❒ sender adds sequence number to each pkt
❒ receiver discards (doesn’t deliver up) duplicate pkt

receiver FSM fragment (in-order packet received OK):
  extract(rcvpkt,data)
  deliver_data(data)
  sndpkt = make_pkt(ACK, chksum)
  udt_send(sndpkt)
rdt2.1: discussion
Sender:
❒ seq # added to pkt
❒ two seq. #’s (0,1) will suffice. Why?
❒ must check if received ACK/NAK corrupted
❒ twice as many states
  ❍ state must “remember” whether “current” pkt has 0 or 1 seq. #

Receiver:
❒ must check if received packet is duplicate
  ❍ state indicates whether 0 or 1 is the expected pkt seq #
❒ note: receiver can not know if its last ACK/NAK was received OK at sender
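A minimal sketch (mine, not from the slides) of the receiver-side duplicate handling described above, with a 1-bit expected sequence number; the Packet type and the corruption check are illustrative stand-ins for a real checksum.

```python
# Sketch of an rdt2.1-style receiver: 1-bit sequence number, discard duplicates.
from dataclasses import dataclass

@dataclass
class Packet:
    seq: int                 # 0 or 1
    data: bytes
    corrupt: bool = False    # stand-in for a real checksum test

class RdtReceiver:
    def __init__(self):
        self.expected_seq = 0
        self.delivered = []

    def rdt_rcv(self, pkt: Packet) -> str:
        if pkt.corrupt:
            return "NAK"                      # ask sender to retransmit
        if pkt.seq != self.expected_seq:
            return "ACK"                      # duplicate: re-ACK, do NOT deliver again
        self.delivered.append(pkt.data)       # in-order, new packet: deliver up
        self.expected_seq ^= 1                # flip expected 0 <-> 1
        return "ACK"

r = RdtReceiver()
print(r.rdt_rcv(Packet(0, b"a")))   # ACK, delivered
print(r.rdt_rcv(Packet(0, b"a")))   # ACK again, but the duplicate is discarded
print(r.rdt_rcv(Packet(1, b"b")))   # ACK, delivered
```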
rdt2.2: a NAK-free protocol
❒ same functionality as rdt2.1, using ACKs only: the receiver ACKs the last pkt received OK (including its seq #), and a duplicate ACK at the sender triggers the same retransmission a NAK would

rdt3.0 performance (stop-and-wait)
❒ example: 1 Gbps link, 8000-bit packet, RTT = 30 msec

  d_trans = L / R = 8000 bits / 10^9 bps = 8 microseconds

  ❍ U_sender: utilization – fraction of time sender busy sending

  U_sender = (L/R) / (RTT + L/R) = 0.008 / 30.008 = 0.00027

  ❍ 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
  ❍ network protocol limits use of physical resources!
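A quick numeric check of the stop-and-wait utilization above (my own sketch, using the same numbers: L = 8000 bits, R = 1 Gbps, RTT = 30 ms):

```python
L = 8000            # packet size, bits
R = 1e9             # link rate, bits/sec
RTT = 0.030         # round-trip time, seconds

d_trans = L / R                      # 8e-06 s = 8 microseconds
U_sender = (L / R) / (RTT + L / R)   # ≈ 0.00027
throughput = (L / 8) / (RTT + L / R) # ≈ 33,000 bytes/sec ≈ 33 kB/sec
print(d_trans, U_sender, throughput)
```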
rdt3.0: stop-and-wait operation
(figure: sender/receiver timing diagram — first packet bit transmitted at t = 0, last packet bit transmitted at t = L / R, ACK returns roughly one RTT later)

U_sender = (L/R) / (RTT + L/R) = 0.008 / 30.008 = 0.00027
Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
  ❍ range of sequence numbers must be increased
  ❍ buffering at sender and/or receiver

Increase utilization by a factor of 3!

U_sender = 3 * (L/R) / (RTT + L/R) = 0.024 / 30.008 = 0.0008
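In general, with N equal-size packets in flight per round trip the same calculation scales by N (a small sketch of mine, reusing the numbers above):

```python
def utilization(N, L=8000, R=1e9, RTT=0.030):
    """Sender utilization with N packets pipelined per round trip."""
    return N * (L / R) / (RTT + L / R)

print(utilization(1))   # ≈ 0.00027  (stop-and-wait)
print(utilization(3))   # ≈ 0.0008   (the factor-of-3 improvement above)
```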
Pipelining Protocols
Go-back-N: big picture:
❒ Sender can have up to N unacked packets in pipeline
❒ Rcvr only sends cumulative acks
  ❍ Doesn’t ack packet if there’s a gap in packet reception (missing pkt)
❒ Sender has timer for oldest unacked packet
  ❍ If timer expires, retransmit all unacked packets

Selective Repeat: big picture:
❒ Sender can have up to N unacked packets in pipeline
❒ Rcvr acks individual packets
❒ Sender maintains timer for each unacked packet
  ❍ When timer expires, retransmit only that unacked packet
Selective repeat: big picture
❒ Sender can have up to N unacked packets
in pipeline
❒ Rcvr acks individual packets
❒ Sender maintains timer for each unacked
packet
❍ When timer expires, retransmit only that unacked packet
Go-Back-N
Sender:
❒ k-bit seq # in pkt header
❒ “window” of up to N, consecutive unack’ed pkts allowed

Selective repeat: dilemma
(figure: two retransmission scenarios, (a) and (b), that produce identical packet arrivals at the receiver)
❒ receiver sees no difference in the two scenarios!
❒ incorrectly passes duplicate data as new in (a)
Q: what relationship between seq # size and window size?
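A compact Go-Back-N sender sketch (my own illustration of the behavior summarized above, not code from the course): a window of N unacked packets, cumulative ACKs that advance the base, and a single timer that triggers retransmission of everything outstanding.

```python
# Go-Back-N sender sketch: window of N, cumulative ACKs, one timer for the
# oldest unacked packet; on timeout, resend every unacked packet.
class GbnSender:
    def __init__(self, N, send_fn):
        self.N = N
        self.base = 0            # oldest unacked seq #
        self.nextseq = 0         # next seq # to use
        self.buffer = {}         # seq # -> packet payload
        self.send_fn = send_fn   # e.g. a UDP send, injected here for testing

    def rdt_send(self, data) -> bool:
        if self.nextseq >= self.base + self.N:
            return False                         # window full: refuse data
        self.buffer[self.nextseq] = data
        self.send_fn(self.nextseq, data)
        self.nextseq += 1
        return True

    def ack_rcvd(self, acknum):
        # cumulative ACK: everything up to and including acknum is done
        for s in range(self.base, acknum + 1):
            self.buffer.pop(s, None)
        self.base = max(self.base, acknum + 1)

    def timeout(self):
        # retransmit ALL packets that are still unacked
        for s in range(self.base, self.nextseq):
            self.send_fn(s, self.buffer[s])

sent = []
g = GbnSender(N=4, send_fn=lambda s, d: sent.append(s))
for ch in "abcdef":
    g.rdt_send(ch)          # only 4 of the 6 are accepted (window = 4)
g.ack_rcvd(1)               # cumulative ACK for 0 and 1 slides the window
g.timeout()                 # resends 2 and 3
print(sent)                 # [0, 1, 2, 3, 2, 3]
```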
Chapter 4 outline
❒ 4.1 Transport-layer services
❒ 4.2 Multiplexing and demultiplexing
❒ 4.3 Connectionless transport: UDP
❒ 4.4 Principles of reliable data transfer (rdt)
❒ 4.5 Connection-oriented transport: TCP
  ❍ segment structure
  ❍ reliable data transfer
  ❍ flow control
  ❍ connection management
❒ 4.6 Principles of congestion control
❒ 4.7 TCP congestion control
TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581
(figure: sample RTT and estimated RTT, roughly 100–350 milliseconds, plotted against time in seconds)

TCP Round Trip Time and Timeout
EstimatedRTT is an exponentially weighted moving average of the measured SampleRTT values; DevRTT estimates how much SampleRTT typically deviates from it:

DevRTT = (1-β)*DevRTT + β*|SampleRTT - EstimatedRTT|    (typically, β = 0.25)
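A sketch of the smoothing described above (mine, not from the slides). The EstimatedRTT update with α = 0.125 and the timeout rule TimeoutInterval = EstimatedRTT + 4·DevRTT are the standard companions to the DevRTT formula (per RFC 6298); treat the constants and starting values here as assumptions.

```python
# Exponentially weighted moving averages for TCP RTT estimation.
ALPHA, BETA = 0.125, 0.25

def update_rtt(sample_rtt, estimated_rtt, dev_rtt):
    # deviation is updated against the previous EstimatedRTT, as in RFC 6298
    dev_rtt = (1 - BETA) * dev_rtt + BETA * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - ALPHA) * estimated_rtt + ALPHA * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout_interval

est, dev = 0.100, 0.005                  # seconds; illustrative starting values
for sample in (0.120, 0.260, 0.110):     # made-up SampleRTT measurements
    est, dev, timeout = update_rtt(sample, est, dev)
    print(f"EstimatedRTT={est:.3f}s  DevRTT={dev:.3f}s  Timeout={timeout:.3f}s")
```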
TCP reliable data transfer
❒ TCP creates rdt service on top of IP’s unreliable service
❒ Pipelined segments
❒ Cumulative acks
❒ TCP uses single retransmission timer
❒ Retransmissions are triggered by:
  ❍ timeout events
  ❍ duplicate acks
❒ Initially consider simplified TCP sender:
  ❍ ignore duplicate acks
  ❍ ignore flow control, congestion control
TCP sender events:
data rcvd from app:
❒ Create segment with seq #
❒ seq # is byte-stream number of first data byte in segment
❒ start timer if not already running (think of timer as for oldest unacked segment)
❒ expiration interval: TimeOutInterval

timeout:
❒ retransmit segment that caused timeout
❒ restart timer

Ack rcvd:
❒ If it acknowledges previously unacked segments:
  ❍ update what is known to be acked
  ❍ start timer if there are outstanding segments

TCP sender pseudocode fragment:
  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum
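A sketch (mine) of the simplified TCP sender's three event handlers just described, ignoring duplicate ACKs, flow control, and congestion control as the slides do; the timer here is just a boolean flag.

```python
# Simplified TCP sender: three events (data from app, timeout, ACK received).
class SimplifiedTcpSender:
    def __init__(self, initial_seq, send_fn):
        self.next_seq_num = initial_seq
        self.send_base = initial_seq       # oldest unacked byte
        self.unacked = {}                  # first byte number -> segment data
        self.timer_running = False
        self.send_fn = send_fn

    def data_from_app(self, data: bytes):
        self.unacked[self.next_seq_num] = data
        self.send_fn(self.next_seq_num, data)      # seq # = number of first data byte
        if not self.timer_running:
            self.timer_running = True              # timer covers oldest unacked segment
        self.next_seq_num += len(data)

    def timeout(self):
        seq = self.send_base                       # retransmit segment that caused timeout
        self.send_fn(seq, self.unacked[seq])
        self.timer_running = True                  # restart timer

    def ack_rcvd(self, acknum):
        if acknum > self.send_base:                # acknowledges new data (cumulative ACK)
            for seq in [s for s in self.unacked if s < acknum]:
                del self.unacked[seq]
            self.send_base = acknum
            self.timer_running = bool(self.unacked)   # timer only if data still outstanding
```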
TCP retransmission scenarios
(figure, lost ACK scenario: Host A sends Seq=92, 8 bytes data; the ACK=100 is lost; A times out and retransmits Seq=92, 8 bytes data; SendBase advances to 100 when the ACK for the retransmission arrives)
(figure, premature timeout: A sends Seq=92, 8 bytes data and Seq=100, 20 bytes data; the timeout for Seq=92 fires before ACK=100 arrives, so Seq=92 is retransmitted; SendBase moves to 100 and then 120 as the cumulative ACKs arrive)
TCP retransmission scenarios (more)
(figure, cumulative ACK scenario: Host A sends Seq=92, 8 bytes data and Seq=100, 20 bytes data to Host B; ACK=100 is lost, but ACK=120 arrives before the timeout, so nothing is retransmitted and SendBase = 120)
TCP ACK generation [RFC 1122, RFC 2581]
(figure: the second segment is lost (X) and is resent after a timeout)
TCP Flow Control
❒ receive side of TCP connection has a receive buffer:
❒ app process may be slow at reading from buffer
❒ speed-matching service: matching the send rate to the receiving app’s drain rate

flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast
TCP Flow control: how it works
(Suppose TCP receiver discards out-of-order segments)
❒ spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
❒ Rcvr advertises spare room by including value of RcvWindow in segments
❒ Sender limits unACKed data to RcvWindow
  ❍ guarantees receive buffer doesn’t overflow
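A small sketch (mine) of the bookkeeping above: the receiver advertises RcvWindow, and the sender checks that the amount of unACKed data never exceeds it. Variable names follow the slide; the concrete byte counts are just an example.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises back to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent, last_byte_acked, nbytes, rcv_window_advertised):
    """Flow-control check: unACKed data after sending must fit in RcvWindow."""
    return (last_byte_sent + nbytes) - last_byte_acked <= rcv_window_advertised

# Example: 64 kB buffer, 20 kB received so far, only 5 kB read by the application.
win = rcv_window(64_000, 20_000, 5_000)                     # 49,000 bytes of spare room
print(win, sender_may_send(20_000, 10_000, 30_000, win))    # True: 40,000 <= 49,000
```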
Chapter 4 outline
❒ 4.1 Transport-layer services
❒ 4.2 Multiplexing and demultiplexing
❒ 4.3 Connectionless transport: UDP
❒ 4.4 Principles of reliable data transfer
❒ 4.5 Connection-oriented transport: TCP
  ❍ segment structure
  ❍ reliable data transfer
  ❍ flow control
  ❍ connection management
❒ 4.6 Principles of congestion control
❒ 4.7 TCP congestion control
TCP Connection Management
Recall: TCP sender, receiver establish “connection” before exchanging data segments
❒ initialize TCP variables:
  ❍ seq. #s

Three way handshake:
Step 1: client host sends TCP SYN segment to server
  ❍ specifies initial seq #
  ❍ no data
Step 2: server host receives SYN, replies with SYNACK segment
Step 3: client receives SYNACK, replies with ACK segment (which may carry data)

Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
(figure: FIN/ACK exchange; client enters timed wait, then closed)
TCP Connection Management (cont.)
Step 3: client receives FIN, replies with ACK
  ❍ enters “timed wait” – will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
(figure: FIN/ACK exchange; both sides end in the closed state)
TCP Connection Management (cont)
(figures: TCP client lifecycle and TCP server lifecycle state diagrams)
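At the application level the handshake and teardown are driven by ordinary socket calls. Here is a minimal Python sketch of mine: connect() triggers the SYN/SYNACK/ACK exchange, and close() starts the FIN exchange. The port 12000 is just an example.

```python
import socket, threading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 12000))          # 12000 is an arbitrary example port
srv.listen(1)                           # passive open: server waits for a SYN

def serve():
    conn, addr = srv.accept()           # three-way handshake completes here
    print("server got:", conn.recv(1024))
    conn.close()                        # server sends its FIN
    srv.close()

threading.Thread(target=serve).start()

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", 12000))       # active open: SYN / SYNACK / ACK
cli.send(b"hello")
cli.close()                             # client sends FIN, enters timed wait
```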
Chapter 4 outline
❒ 4.1 Transport-layer services
❒ 4.2 Multiplexing and demultiplexing
❒ 4.3 Connectionless transport: UDP
❒ 4.4 Principles of reliable data transfer
❒ 4.5 Connection-oriented transport: TCP
  ❍ segment structure
  ❍ reliable data transfer
  ❍ flow control
  ❍ connection management
❒ 4.6 Principles of congestion control
❒ 4.7 TCP congestion control
Principles of Congestion Control
Congestion:
❒ informally: “too many sources sending too much
data too fast for network to handle”
❒ different from flow control!
❒ manifestations:
❍ lost packets (buffer overflow at routers)
❍ long delays (queueing in router buffers)
❒ a top-10 problem!
Causes/costs of congestion: scenario 1
❒ two senders, two receivers
❒ one router, infinite buffers
❒ no retransmission
❒ large delays when congested
❒ maximum achievable throughput
(figure: Host A and Host B send λin (original data) into a shared router with unlimited output link buffers; λout is the delivered throughput)
Causes/costs of congestion: scenario 2
(figure: three plots (a), (b), (c) of λout versus offered load, with goodput limited to values such as R/3 and R/4 once retransmissions and duplicates consume capacity)
“costs” of congestion:
❒ more work (retrans) for given “goodput”
❒ unneeded retransmissions: link carries multiple copies of pkt
Causes/costs of congestion: scenario 3
❒ four senders
❒ multihop paths
❒ timeout/retransmit
Q: what happens as λin and λ'in increase?
(figure: Host A and Host B send λin (original data) plus λ'in (original data, plus retransmitted data) across multihop paths; λout is the delivered throughput)
Causes/costs of congestion: scenario 3
(figure: λout collapses toward zero as the offered load from Host A and Host B grows — another “cost” of congestion: when a packet is dropped, any upstream capacity used to carry it was wasted)
TCP congestion control: additive increase,
multiplicative decrease (AIMD)
❒ Approach: increase transmission rate (window size), probing for usable bandwidth, until loss occurs
  ❍ additive increase: increase CongWin by 1 MSS every RTT until loss detected
  ❍ multiplicative decrease: cut CongWin in half after loss
(figure: saw tooth behavior of the congestion window — probing for bandwidth; window size oscillates, e.g. between 8, 16, and 24 Kbytes, over time)
TCP Congestion Control: details
❒ sender limits transmission:
    LastByteSent - LastByteAcked ≤ CongWin
❒ Roughly,
    rate = CongWin / RTT   Bytes/sec
❒ CongWin is dynamic, function of perceived network congestion

How does sender perceive congestion?
❒ loss event = timeout or 3 duplicate acks
❒ TCP sender reduces rate (CongWin) after loss event
three mechanisms:
  ❍ AIMD
  ❍ slow start
  ❍ conservative after timeout events
TCP Slow Start
❒ When connection begins, CongWin = 1 MSS
  ❍ Example: MSS = 500 bytes & RTT = 200 msec
  ❍ initial rate = 20 kbps
❒ available bandwidth may be >> MSS/RTT
  ❍ desirable to quickly ramp up to respectable rate
❒ When connection begins, increase rate exponentially fast until first loss event
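A quick arithmetic check of the 20 kbps figure above (my own, using the slide's numbers):

```python
MSS_BITS = 500 * 8       # one 500-byte segment
RTT = 0.200              # seconds
print(MSS_BITS / RTT)    # 20000.0 bits/sec = 20 kbps
```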
TCP Slow Start (more)
❒ When connection begins, increase rate exponentially until first loss event:
  ❍ double CongWin every RTT
  ❍ done by incrementing CongWin for every ACK received
❒ Summary: initial rate is slow but ramps up exponentially fast
(figure: Host A sends one segment, then two segments, then four segments in successive RTTs)
Refinement: inferring loss
❒ After 3 dup ACKs:
  ❍ CongWin is cut in half
  ❍ window then grows linearly
❒ But after timeout event:
  ❍ CongWin instead set to 1 MSS;
  ❍ window then grows exponentially
  ❍ to a threshold, then grows linearly

Philosophy:
❒ 3 dup ACKs indicates network capable of delivering some segments
❒ timeout indicates a “more alarming” congestion scenario
Refinement
Q: When should the exponential increase switch to linear?
A: When CongWin gets to 1/2 of its value before timeout.

Implementation:
❒ Variable Threshold
❒ At loss event, Threshold is set to 1/2 of CongWin just before loss event
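Putting the last few slides together, here is a sketch of mine of the window-update rules: slow start, additive increase, halving on 3 dup ACKs, reset to 1 MSS on timeout, with Threshold set to half the window at each loss. It is a simplification, not the exact TCP Reno state machine, and the initial Threshold of 64 MSS is just an illustrative value.

```python
MSS = 1  # measure the window in MSS units for simplicity

class CongestionControl:
    def __init__(self):
        self.cong_win = 1 * MSS
        self.threshold = 64 * MSS          # illustrative initial threshold

    def on_rtt_of_acks(self):
        """Window growth per RTT in which all outstanding data was ACKed."""
        if self.cong_win < self.threshold:
            self.cong_win *= 2             # slow start: exponential growth
        else:
            self.cong_win += 1 * MSS       # congestion avoidance: additive increase

    def on_triple_dup_ack(self):
        self.threshold = self.cong_win / 2 # remember half the window...
        self.cong_win = self.threshold     # ...and continue growing linearly from there

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = 1 * MSS            # back to slow start

cc = CongestionControl()
for _ in range(7):
    cc.on_rtt_of_acks()
print(cc.cong_win)                   # 65: doubled up to the threshold (64), then +1
cc.on_triple_dup_ack()
print(cc.cong_win, cc.threshold)     # 32.5 32.5: halved, then grows linearly
cc.on_timeout()
print(cc.cong_win, cc.threshold)     # 1 16.25: back to 1 MSS, slow start again
```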
Summary: TCP Congestion Control
❒ When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
❒ When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
❒ When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold.
❒ When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS.

TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K
(figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router with capacity R)
Why is TCP fair?
Two competing sessions:
❒ Additive increase gives slope of 1, as throughput increases
❒ multiplicative decrease decreases throughput proportionally
(figure: phase plot of Connection 1 throughput vs. Connection 2 throughput, bounded by R, converging toward the equal-bandwidth-share line)
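A tiny simulation sketch (mine) of the argument above: two AIMD flows sharing capacity R both add 1 per RTT and both halve when their combined rate exceeds R, so their rates converge toward equal shares regardless of the starting point.

```python
# Two AIMD flows sharing a bottleneck of capacity R: the rate difference is
# unchanged by additive increase and halved by each multiplicative decrease,
# so the flows converge to equal (fair) shares.
R = 100.0                 # bottleneck capacity (arbitrary units)
x1, x2 = 70.0, 10.0       # deliberately unfair starting rates

for _ in range(200):
    x1 += 1.0             # additive increase: slope 1 in the (x1, x2) plane
    x2 += 1.0
    if x1 + x2 > R:       # loss once the bottleneck is exceeded
        x1 /= 2.0         # multiplicative decrease for both flows
        x2 /= 2.0

print(round(x1 - x2, 3))  # the gap has shrunk from 60 toward 0
```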
Fairness (more)
Fairness and UDP
❒ Multimedia apps often do not use TCP
  ❍ do not want rate throttled by congestion control
❒ Instead use UDP:
  ❍ pump audio/video at constant rate, tolerate packet loss
❒ Research area: TCP friendly

Fairness and parallel TCP connections
❒ nothing prevents app from opening parallel connections between 2 hosts.
❒ Web browsers do this