Lec 17-23 TCP Protocol-Flow Control-Congestion Control
• Transport Layer
– TCP Protocol
• Connection Establishment
• TCP Segment Structure
• Reliable data transfer
• Flow control
• Congestion control
Computer Networks (CS F303) BITS Pilani, Pilani Campus
TCP [RFCs: 793, 1122, 1323, 2018, 2581]
TCP: Wireshark Capture
TCP Sequence Numbers and ACKs
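Sequence Numbers:
• Byte-stream “number” of the first byte in the segment’s data
Acknowledgements:
• Sequence number of the next byte expected from the other side (cumulative ACK)
[Figure: outgoing segment from sender, showing the header fields source port #, dest port #, sequence number, acknowledgement number, rwnd (window size), checksum, and urg pointer]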
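[Figure: 2-way handshake between two hosts (application over network). One side chooses x and sends req_conn(x), then enters ESTAB; the other side replies with acc_conn(x) and enters ESTAB.]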
TCP 3-way Handshake
TCP Retransmission Scenarios
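[Figure: TCP retransmission scenarios between Host A and Host B.
Lost ACK: Host A sends Seq=92, 8 bytes of data; ACK=100 is lost (X); Host A times out and retransmits Seq=92, 8 bytes of data; Host B replies with ACK=100 again.
Premature timeout: SendBase=92; ACK=100 and ACK=120 arrive after the timeout has fired, and SendBase advances to 120.
Cumulative ACK: ACK=100 is lost (X), but ACK=120 arrives and covers it.]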
Is TCP GBN or SR…?
TCP Timeout
Quiz
RTT Estimation
EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT
– Influence of a past sample decreases exponentially fast
– Typical value of α = 0.125
[Figure: RTT (milliseconds) vs. time (seconds), gaia.cs.umass.edu to fantasia.eurecom.fr]
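To see why the influence of a past sample decays exponentially, the update can be unrolled over n samples. This is a worked expansion of the EstimatedRTT formula above; the subscripts n and k are notation introduced here for illustration, with EstimatedRTT_0 standing for the initial estimate.

```latex
\begin{aligned}
\text{EstimatedRTT}_n
  &= (1-\alpha)\,\text{EstimatedRTT}_{n-1} + \alpha\,\text{SampleRTT}_n \\
  &= \alpha \sum_{k=0}^{n-1} (1-\alpha)^{k}\,\text{SampleRTT}_{n-k}
     \;+\; (1-\alpha)^{n}\,\text{EstimatedRTT}_0
\end{aligned}
```

A sample that is k updates old is weighted by α(1-α)^k. With α = 0.125, a sample 10 updates old carries about 26% of the weight of the newest sample, since 0.875^10 ≈ 0.263.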
Timeout Interval
• Timeout Interval
– EstimatedRTT + “safety margin”
– Large variation in EstimatedRTT → larger safety margin
DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)
Example: Timeout Interval
• Consider three RTT samples (in ms): 150, 200, and 210, in that order. Assume an
initial EstimatedRTT = 200 ms, initial DevRTT = 50 ms, β = 0.25, and α = 0.125
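The example can be worked through with a short script. This is a minimal sketch under two assumptions that are not stated in the slide text: the timeout is taken as TimeoutInterval = EstimatedRTT + 4*DevRTT (the usual choice of safety margin, as in RFC 6298), and DevRTT is updated using the EstimatedRTT from before the current sample. The function names `update` and `run_example` are hypothetical.

```python
ALPHA = 0.125  # weight of the newest sample in EstimatedRTT
BETA = 0.25    # weight of the newest deviation in DevRTT


def update(estimated_rtt, dev_rtt, sample_rtt):
    """Apply one SampleRTT; return (EstimatedRTT, DevRTT, TimeoutInterval).

    Assumptions: DevRTT uses the EstimatedRTT from before this sample,
    and TimeoutInterval = EstimatedRTT + 4 * DevRTT.
    """
    dev_rtt = (1 - BETA) * dev_rtt + BETA * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - ALPHA) * estimated_rtt + ALPHA * sample_rtt
    timeout = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout


def run_example(samples=(150, 200, 210), estimated_rtt=200.0, dev_rtt=50.0):
    """Feed the samples in order and collect the state after each one."""
    rows = []
    for sample in samples:
        estimated_rtt, dev_rtt, timeout = update(estimated_rtt, dev_rtt, sample)
        rows.append((sample, estimated_rtt, dev_rtt, timeout))
    return rows


for sample, est, dev, timeout in run_example():
    print(f"sample={sample} ms  EstimatedRTT={est:.2f}  "
          f"DevRTT={dev:.2f}  TimeoutInterval={timeout:.2f}")
```

Under these assumptions the timeout interval after the three samples is 393.75 ms, about 350.78 ms, and about 329.12 ms. If DevRTT is instead computed with the already-updated EstimatedRTT, the numbers differ slightly.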
TCP Connection Close
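[Figure: closing a TCP connection. The server, in LAST_ACK, sends FINbit=1, seq=y and can no longer send data. The client, in TIMED_WAIT, replies with ACKbit=1; ACKnum=y+1, then stays in timed wait for 2*max segment lifetime. Both sides end in CLOSED.]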
TCP Connection States-Client and Server
TCP Flow Control
Congestion Control
• What is congestion?
– Too many sources sending too much data too fast for the network to handle
• Congestion results in
– Packet losses
– Packet delays
– Throughput reduction
Causes/Costs of Congestion: Scenario-1
Simplest scenario:
• One router, infinite buffers
• Input, output link capacity: R
• Two flows
• No retransmissions needed
Q: What happens as arrival rate λin approaches R/2?
• Maximum per-connection throughput: R/2
• Large delays as arrival rate λin approaches capacity
[Figure: Host A and Host B send original data at rate λin through one router with infinite shared output link buffers; throughput is λout. Plots: throughput λout vs. λin, and delay vs. λin, both up to λin = R/2.]
Causes/Costs of Congestion: Scenario-2
• One router, finite buffers
• Sender retransmits lost and timed-out packets
– Application-layer input = application-layer output: λin = λout
– Transport-layer input includes retransmissions: λ'in ≥ λin
• Packets can be lost (dropped at the router due to full buffers), requiring retransmissions
• But the sender can time out prematurely, sending two copies, both of which are delivered
• When sending at R/2, some packets are retransmissions, including needed and un-needed duplicates, that are delivered
Costs of congestion:
• More work for a given receiver throughput
• Unneeded retransmissions: decreasing maximum achievable throughput
[Figure: Host A (λin: original data; λ'in: original data plus retransmitted data) and Host B send through a router with finite shared output link buffers and link capacity R; plot of throughput λout vs. λin up to R/2.]
Causes/Costs of Congestion: Scenario-3
[Figures: plots of throughput λout against λin (both axes up to R/2), and of delay against λin. Captions: loss/retransmission decreases effective throughput; un-needed duplicates further decrease effective throughput; capacity is wasted for packets lost downstream (throughput λout vs. λ'in).]
Approaches Towards Congestion Control [.1]
Approaches Towards Congestion Control [..2]
TCP Congestion Control
• Approach
– Sending rate is a function of perceived congestion
TCP Congestion Control
TCP Fast Retransmit: Example
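[Figure: Host B returns ACK=100 four times (the original plus three duplicates). On the third duplicate, Host A retransmits Seq=100, 20 bytes of data, before its timeout expires.]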
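The duplicate-ACK rule behind fast retransmit can be sketched as a small counter over the incoming ACK numbers. This is an illustrative sketch only: the function name `fast_retransmit_points` is hypothetical, and it ignores details a real sender handles (for example, whether any data is still outstanding).

```python
def fast_retransmit_points(acks):
    """Return (index, ack_number) pairs where a third duplicate ACK arrives.

    `acks` is the sequence of ACK numbers seen by the sender, in order.
    """
    fired = []
    last_ack = None
    dup_count = 0
    for i, ack in enumerate(acks):
        if ack == last_ack:
            dup_count += 1          # same ACK number again: a duplicate
            if dup_count == 3:      # triple duplicate ACK
                fired.append((i, ack))  # retransmit segment starting at `ack`
        else:
            last_ack = ack          # new ACK number: restart the count
            dup_count = 0
    return fired


# The ACK stream from the figure: one ACK=100 followed by three duplicates.
print(fast_retransmit_points([100, 100, 100, 100]))
```

The sender retransmits the segment whose first byte is the duplicated ACK number, without waiting for the retransmission timer.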
How to start a connection: TCP Slow Start
Switching from Slow Start to Congestion Avoidance
FSM Description of TCP Congestion Control
Initial transition (Λ): cwnd = 1 MSS; ssthresh = 64 KB; dupACKcount = 0; enter slow start
• Slow start
– new ACK: cwnd = cwnd + MSS; dupACKcount = 0; transmit new segment(s), as allowed
– duplicate ACK: dupACKcount++
– cwnd > ssthresh (Λ): move to congestion avoidance
– timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment (stay in slow start)
– dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; move to fast recovery
• Congestion avoidance
– new ACK: cwnd = cwnd + MSS*(MSS/cwnd); dupACKcount = 0; transmit new segment(s), as allowed
– duplicate ACK: dupACKcount++
– timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; move to slow start
– dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; move to fast recovery
• Fast recovery
– duplicate ACK: cwnd = cwnd + MSS; transmit new segment(s), as allowed
– new ACK: cwnd = ssthresh; dupACKcount = 0; move to congestion avoidance
– timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; move to slow start
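The three-state FSM can be sketched as a small event-driven class. This is a minimal illustration, not a real TCP implementation: the class name `RenoSender`, its method names, and the value MSS = 1460 bytes are assumptions made for the example, and actual segment transmission and retransmission are left as comments.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed value for illustration


@dataclass
class RenoSender:
    cwnd: float = MSS              # congestion window, in bytes
    ssthresh: float = 64 * 1024    # slow start threshold, in bytes
    dup_acks: int = 0              # dupACKcount in the FSM
    state: str = "slow_start"

    def on_new_ack(self) -> None:
        if self.state == "slow_start":
            self.cwnd += MSS                       # +1 MSS per ACK
        elif self.state == "congestion_avoidance":
            self.cwnd += MSS * (MSS / self.cwnd)   # about +1 MSS per RTT
        else:                                      # fast_recovery
            self.cwnd = self.ssthresh              # deflate the window
            self.state = "congestion_avoidance"
        self.dup_acks = 0
        if self.state == "slow_start" and self.cwnd > self.ssthresh:
            self.state = "congestion_avoidance"
        # transmit new segment(s), as allowed by cwnd

    def on_dup_ack(self) -> None:
        if self.state == "fast_recovery":
            self.cwnd += MSS                       # inflate per duplicate ACK
            return
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS
            self.state = "fast_recovery"
            # retransmit missing segment

    def on_timeout(self) -> None:
        self.ssthresh = self.cwnd / 2
        self.cwnd = MSS
        self.dup_acks = 0
        self.state = "slow_start"
        # retransmit missing segment


sender = RenoSender()
for event in ["ack", "ack", "ack", "dup", "dup", "dup", "ack", "timeout"]:
    {"ack": sender.on_new_ack,
     "dup": sender.on_dup_ack,
     "timeout": sender.on_timeout}[event]()
    print(f"{event:8s} state={sender.state:21s} "
          f"cwnd={sender.cwnd / MSS:.2f} MSS ssthresh={sender.ssthresh / MSS:.2f} MSS")
```

Keeping cwnd in bytes lets the congestion-avoidance increment MSS*(MSS/cwnd) be applied per ACK exactly as written in the FSM.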
TCP Sawtooth Behavior
[Figure: congestion window vs. time, showing the sawtooth produced by congestion avoidance; timeouts may still occur.]
TCP Congestion Control: AIMD
Approach: senders can increase sending rate until packet loss (congestion) occurs, then decrease sending rate on loss event
• Additive Increase: increase sending rate by 1 maximum segment size every RTT until loss detected
• Multiplicative Decrease: cut sending rate in half at each loss event
[Figure: AIMD sawtooth behavior (probing for bandwidth): TCP sender sending rate vs. time]
TCP Reno Throughput: Macroscopic Model
[Figure: TCP sender sending rate vs. time, a sawtooth oscillating between W/2 and W]
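The sawtooth leads to a simple estimate of average throughput. The derivation below is a sketch under idealized assumptions that are not spelled out in the slide text: the window grows linearly from W/2 to W, a loss occurs exactly when the window reaches W, RTT is constant, and slow start is ignored.

```latex
\text{average window} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}\,W
\qquad\Longrightarrow\qquad
\text{average throughput} \approx \frac{3}{4}\,\frac{W}{\text{RTT}}
```

With W in bytes and RTT in seconds, the result is in bytes per second.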
Example: TCP Congestion Control
TCP Fairness
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]
Is TCP Fair?
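[Figure: User 2’s allocation x2 plotted against User 1’s allocation, showing the Fairness Line, the Efficiency Line, the overload and underutilization regions, and the optimal point where the two lines intersect]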
Additive Increase/Decrease
• Both X1 and X2 increase/decrease by the same amount over time
– The additive increase/decrease policy of increasing both users’ allocations by a_I corresponds to moving along a 45° line
[Figure (two panels): User 2’s allocation x2 plotted against User 1’s allocation, with the Fairness Line and the Efficiency Line; in each panel the allocation moves from T0 to T1]
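A small simulation makes the geometric argument concrete. This is a sketch under simplifying assumptions that are not from the slides: both users see congestion at the same time, react in synchronized rounds, increase by 1 unit per round, and the starting allocations (10 and 40) and capacity (60) are arbitrary illustrative values. The function name `simulate` is hypothetical.

```python
def simulate(x1, x2, capacity, rounds, multiplicative_decrease):
    """Evolve two users' allocations for `rounds` synchronized rounds."""
    for _ in range(rounds):
        if x1 + x2 > capacity:                 # overload: both back off
            if multiplicative_decrease:
                x1, x2 = x1 / 2, x2 / 2        # halves the gap x2 - x1
            else:
                x1, x2 = max(x1 - 1, 0), max(x2 - 1, 0)  # gap unchanged
        else:                                  # underutilized: both probe
            x1, x2 = x1 + 1, x2 + 1            # move along a 45-degree line
    return x1, x2


aiad = simulate(10, 40, capacity=60, rounds=200, multiplicative_decrease=False)
aimd = simulate(10, 40, capacity=60, rounds=200, multiplicative_decrease=True)
print("additive increase / additive decrease:      ", aiad)
print("additive increase / multiplicative decrease:", aimd)
```

With purely additive steps the difference between the two allocations never changes, so the system slides back and forth parallel to the fairness line. With multiplicative decrease every back-off halves the difference, so the allocations drift toward the fairness line.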
What is the Right Choice?
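[Figure: User 2’s allocation x2 plotted against User 1’s allocation, with the Fairness Line and the Efficiency Line; the allocation moves through the points x0, x1, and x2]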
Goals for TCP Fairness and Throughput
Compound TCP Implementation
• Default TCP implementation in Windows 2008 TCP Stack
– Good for connections with large “bandwidth-delay” products
– Makes congestion decisions that modify the transmission rate based on RTT variation
• Key idea: split cwnd into two separate windows
[Figure: window size vs. time, beginning with slow start]
• Pros: fast ramp up, more fair to flows with different RTTs
• Cons: must estimate precise value of RTT, which is challenging
TCP Cubic
• W(t) = C(t – K)^3 + Wmax
- W grows as the cube of the distance between the current time t and K (the future point in time at which W reaches Wmax again)
- K is the time the function takes to increase W back to Wmax (when there is no further loss)
K = ∛(Wmax*β / C)
- β is the multiplicative decrease factor
- C is a CUBIC parameter
- t is the time elapsed since the last window reduction
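The window function can be evaluated directly from the two formulas. This is a minimal sketch: the function names `cubic_k` and `cubic_window` are hypothetical, and the default values β = 0.2 and C = 0.4 are illustrative assumptions that do not appear in the slide text. The window is measured in segments and time in seconds.

```python
def cubic_k(w_max, beta=0.2, c=0.4):
    """Time at which W(t) returns to w_max: K = cube root of (w_max * beta / c)."""
    return (w_max * beta / c) ** (1.0 / 3.0)


def cubic_window(t, w_max, beta=0.2, c=0.4):
    """W(t) = C * (t - K)^3 + w_max, with t measured since the last reduction."""
    k = cubic_k(w_max, beta, c)
    return c * (t - k) ** 3 + w_max


w_max = 100.0  # window (in segments) just before the last loss; assumed value
k = cubic_k(w_max)
for t in [0.0, 1.0, 2.0, k, k + 1.0, k + 2.0]:
    print(f"t={t:5.2f} s  W={cubic_window(t, w_max):7.2f} segments")
```

With these formulas W(0) = (1 – β)*Wmax, so the window restarts just below the previous maximum, flattens as t approaches K, and grows quickly again once t passes K.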
TCP Reno vs TCP CUBIC Performance
TCP and the congested “bottleneck link”
TCP (classic, CUBIC) increase TCP’s sending rate until packet loss occurs
at some router’s output: the bottleneck link
[Figure: source and destination protocol stacks (application, TCP, network, link, physical) connected through the bottleneck link, whose packet queue is almost never empty and sometimes overflows (packet loss)]
Advanced Computer Networks CS G525 BITS Pilani, Pilani Campus
Evolving Transport-Layer Functionality
• TCP, UDP: principal transport protocols for 40 years
• Different “flavors” of TCP developed, for specific scenarios:
Scenario | Challenges
Long, fat pipes (large data transfers) | Many packets “in flight”; loss shuts down pipeline
Wireless networks | Loss due to noisy wireless links, mobility; TCP treats this as congestion loss
Long-delay links | Extremely long RTTs
Data center networks | Latency sensitive
Background traffic flows | Low priority, “background” TCP flows
QUIC: Quick UDP Internet Connections
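[Figure: connection establishment compared. With TCP and TLS: TCP handshake (transport layer), then TLS handshake (security), then data. With QUIC: a single QUIC handshake, then data.]
[Figure: HTTP GET requests carried over one connection with TLS encryption and a single RDT stream, compared with GET requests carried over separate QUIC streams, each with its own QUIC encryption and RDT; an error is marked on one RDT stream.]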