Chapter - 3 - V7.01-Transport Layer


Chapter 3
Transport Layer

Computer Networking: A Top Down Approach
7th Edition, Global Edition
Jim Kurose, Keith Ross
Pearson
April 2016
Transport Layer 2-1
Chapter 3: Transport Layer
our goals:
▪ understand principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
▪ learn about Internet transport layer protocols:
  • UDP: connectionless transport
  • TCP: connection-oriented reliable transport
  • TCP congestion control
Chapter 3 outline
3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 principles of congestion control
Transport services and protocols
▪ provide logical communication between app processes running on different hosts, as if the processes were directly connected
▪ transport protocols run in end systems
  • send side: breaks app messages into segments, passes to network layer
  • rcv side: reassembles segments into messages, passes to app layer
▪ more than one transport protocol available to apps
  • Internet: TCP and UDP
[Figure: logical end-end transport between two end systems, each with application, transport, network, data link, and physical layers]
Transport vs. network layer
▪ network layer: logical communication between hosts
▪ transport layer: logical communication between processes
  • relies on, enhances, network layer services

household analogy:
12 kids in Ann’s house sending letters to 12 kids in Bill’s house:
▪ hosts = houses
▪ processes = kids
▪ app messages = letters in envelopes
▪ transport protocol = Ann and Bill who demux to in-house siblings
▪ network-layer protocol = postal service
▪ different kids may provide different service (TCP/UDP)


Internet transport-layer protocols
▪ reliable, in-order delivery (TCP)
  • congestion control
  • flow control
  • connection setup
▪ unreliable, unordered delivery: UDP
  • no-frills extension of “best-effort” IP
▪ services not available:
  • delay guarantees
  • bandwidth guarantees
[Figure: logical end-end transport between two hosts (application, transport, network, data link, physical) across routers that implement only network, data link, and physical layers]


3.2 multiplexing/demultiplexing
multiplexing at sender: handle data from multiple sockets, add transport header (later used for demultiplexing)
demultiplexing at receiver: use header info to deliver received segments to correct socket
[Figure: three hosts, each with application, transport, network, link, and physical layers; processes P1 to P4 attach to the transport layer through sockets]


How demultiplexing works
▪ host receives IP datagrams
  • each datagram has source IP address, destination IP address
  • each datagram carries one transport-layer segment
  • each segment has source, destination port number
▪ host uses IP addresses & port numbers to direct segment to appropriate socket

TCP/UDP segment format (32 bits wide):
  source port # | dest port #
  other header fields
  application data (payload)


Connectionless demultiplexing
▪ recall: created socket has host-local port #:
  DatagramSocket mySocket1 = new DatagramSocket(12534);
▪ recall: when creating datagram to send into UDP socket, must specify
  • destination IP address
  • destination port #
▪ when host receives UDP segment:
  • checks destination port # in segment
  • directs UDP segment to socket with that port #
▪ IP datagrams with same dest. port #, but different source IP addresses and/or source port #, will be directed to same socket at dest
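The rule above (only the destination port selects the UDP socket) can be sketched in a few lines of Python. This is a minimal illustration, not how any real network stack is written: `UdpSocket`, `UdpDemux`, and the host names "A" and "C" are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class UdpSocket:
    port: int
    inbox: list = field(default_factory=list)


class UdpDemux:
    """Hypothetical demultiplexer: one socket per destination port."""

    def __init__(self):
        self.sockets = {}

    def bind(self, port):
        sock = UdpSocket(port)
        self.sockets[port] = sock
        return sock

    def deliver(self, src_ip, src_port, dst_port, payload):
        # Only the destination port picks the socket. The source address
        # is handed to the application so it can reply.
        sock = self.sockets.get(dst_port)
        if sock is None:
            return False  # no socket bound to this port
        sock.inbox.append((src_ip, src_port, payload))
        return True


demux = UdpDemux()
server = demux.bind(6428)
demux.deliver("A", 9157, 6428, b"hello")
demux.deliver("C", 5775, 6428, b"world")
print(len(server.inbox))  # both senders land in the same socket
```

Segments from two different sources end up in the same inbox because they share a destination port.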


Connectionless demux: example
Three hosts, each with one UDP socket:
  • host running P3: DatagramSocket mySocket2 = new DatagramSocket(9157);
  • server running P1: DatagramSocket serverSocket = new DatagramSocket(6428);
  • host running P4: DatagramSocket mySocket1 = new DatagramSocket(5775);
Segments between P3 and the server:
  • server to P3: source port: 6428, dest port: 9157
  • P3 to server: source port: 9157, dest port: 6428
Segments between P4 and the server:
  • source port: ?  dest port: ?
  • source port: ?  dest port: ?


Connection-oriented demux
▪ TCP socket identified by 4-tuple:
  • source IP address
  • source port #
  • dest IP address
  • dest port #
▪ demux: receiver uses all four values to direct segment to appropriate socket
▪ server host may support many simultaneous TCP sockets:
  • each socket identified by its own 4-tuple
▪ web servers have different sockets for each connecting client
  • non-persistent HTTP will have different socket for each request
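For contrast with UDP, the 4-tuple lookup can be sketched as a dictionary keyed on all four values. This is a simplified illustration: `TcpDemux` is a hypothetical class, and the listening socket and handshake that a real server uses to create connection sockets are left out.

```python
class TcpDemux:
    """Hypothetical demultiplexer: one connection socket per 4-tuple."""

    def __init__(self):
        # (src_ip, src_port, dst_ip, dst_port) -> list of received payloads
        self.connections = {}

    def accept(self, src_ip, src_port, dst_ip, dst_port):
        key = (src_ip, src_port, dst_ip, dst_port)
        self.connections[key] = []
        return key

    def deliver(self, src_ip, src_port, dst_ip, dst_port, payload):
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.connections:
            return None  # no matching connection socket
        self.connections[key].append(payload)
        return key


tcp = TcpDemux()
a = tcp.accept("A", 9157, "B", 80)
c1 = tcp.accept("C", 5775, "B", 80)
c2 = tcp.accept("C", 9157, "B", 80)

# All three connections share dest IP B and dest port 80,
# yet each segment reaches its own socket.
tcp.deliver("A", 9157, "B", 80, b"GET /a")
tcp.deliver("C", 9157, "B", 80, b"GET /c")
```

Hosts A and C may even use the same source port (9157): the source IP address still makes the 4-tuples differ.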


Connection-oriented demux: example
[Figure: host A (process P3), server B (processes P4, P5, P6), host C (processes P2, P3)]
Segments shown:
  • B to A: source IP,port: B,80; dest IP,port: A,9157
  • A to B: source IP,port: A,9157; dest IP,port: B,80
  • C to B: source IP,port: C,5775; dest IP,port: B,80
  • C to B: source IP,port: C,9157; dest IP,port: B,80
three segments, all destined to IP address B, dest port 80, are demultiplexed to different sockets
Connection-oriented demux: example (threaded server)
[Figure: host A (process P3), server B (single threaded process P4), host C (processes P2, P3)]
Segments shown:
  • B to A: source IP,port: B,80; dest IP,port: A,9157
  • A to B: source IP,port: A,9157; dest IP,port: B,80
  • C to B: source IP,port: C,5775; dest IP,port: B,80
  • C to B: source IP,port: C,9157; dest IP,port: B,80


UDP: User Datagram Protocol [RFC 768]
▪ “no frills,” “bare bones” Internet transport protocol
▪ “best effort” service, UDP segments may be:
  • lost
  • delivered out-of-order to app
▪ connectionless:
  • no handshaking between UDP sender, receiver
  • each UDP segment handled independently of others
▪ UDP use:
  • streaming multimedia apps (loss tolerant, rate sensitive)
  • DNS
  • SNMP
  • RIP
▪ reliable transfer over UDP:
  • add reliability at application layer
  • application-specific error recovery!
UDP: segment header
UDP segment format (32 bits wide):
  source port # | dest port #
  length | checksum
  application data (payload)
length: length, in bytes, of UDP segment, including header

why is there a UDP?
▪ no connection establishment (which can add delay)
▪ simple: no connection state at sender, receiver
▪ small header size (8B, TCP 20B)
▪ no congestion control: UDP can blast away as fast as desired


UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
sender:
▪ treat segment contents, including header fields, as sequence of 16-bit integers
▪ checksum: addition (one’s complement sum) of segment contents, then bitwise complemented
▪ sender puts checksum value into UDP checksum field
receiver:
▪ compute checksum of received segment
▪ check if computed checksum equals checksum field value:
  • NO - error detected
  • YES - no error detected. But maybe errors nonetheless? More later ….
Internet checksum: example
example: add two 16-bit integers
               1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
               1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

wraparound   1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1

sum            1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum       0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1

Note: when adding numbers, a carryout from the most significant bit needs to be added to the result

Receiver: the sum of all 16-bit words, including the checksum, should be 16 1s if there are no errors

* Check out the online interactive exercises for more examples:
http://gaia.cs.umass.edu/kurose_ross/interactive/
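The wraparound addition can be checked with a short Python sketch that uses the two 16-bit words from the example. The function names are illustrative, and the sketch works on a list of 16-bit integers rather than on a real UDP segment with its pseudo-header.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into bit 0."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # wraparound carry
    return total


def internet_checksum(words):
    """Checksum = bitwise complement of the one's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


def verify(words, checksum):
    """Receiver check: words plus checksum must sum to sixteen 1s."""
    return ones_complement_sum16(list(words) + [checksum]) == 0xFFFF


a = 0b1110011001100110
b = 0b1101010101010101
csum = internet_checksum([a, b])
print(format(csum, "016b"))  # prints 0100010001000011
```

Flipping a single bit of `a` makes `verify` return `False`, which is the error-detected case on the receiver side.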
Principles of reliable data transfer
▪ important in application, transport, link layers
  • top-10 list of important networking topics!
▪ characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)
Reliable data transfer: getting started
send side:
  • rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
  • udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
  • rdt_rcv(): called when packet arrives on rcv-side of channel
  • deliver_data(): called by rdt to deliver data to upper layer
Reliable data transfer: getting started
we’ll:
▪ incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
▪ consider only unidirectional data transfer
  • but control info will flow in both directions!
▪ use finite state machines (FSM) to specify sender, receiver

FSM notation:
  • state: when in this “state”, next state uniquely determined by next event
  • each transition (state 1 to state 2) is labelled with the event causing the state transition, above the actions taken on the state transition


rdt1.0: reliable transfer over a reliable channel
▪ underlying channel perfectly reliable
  • no bit errors
  • no loss of packets
▪ separate FSMs for sender, receiver:
  • sender sends data into underlying channel
  • receiver reads data from underlying channel

sender (single state: Wait for call from above)
  rdt_send(data):
    packet = make_pkt(data)
    udt_send(packet)

receiver (single state: Wait for call from below)
  rdt_rcv(packet):
    extract(packet,data)
    deliver_data(data)


rdt2.0: channel with bit errors
▪ underlying channel may flip bits in packet
  • checksum to detect bit errors
▪ the question: how to recover from errors? (How do humans recover from “errors” during conversation?)
  • acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  • negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  • sender retransmits pkt on receipt of NAK
▪ new mechanisms in rdt2.0 (beyond rdt1.0):
  • error detection (checksum)
  • receiver feedback: control msgs (ACK, NAK) from receiver to sender
rdt2.0: FSM specification
sender
  state “Wait for call from above”:
    rdt_send(data):
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      -> “Wait for ACK or NAK”
  state “Wait for ACK or NAK”:
    rdt_rcv(rcvpkt) && isNAK(rcvpkt):
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && isACK(rcvpkt):
      Λ
      -> “Wait for call from above”

receiver
  state “Wait for call from below”:
    rdt_rcv(rcvpkt) && corrupt(rcvpkt):
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
      extract(rcvpkt,data)
      deliver_data(data)
      udt_send(ACK)
rdt2.0: operation with no errors
Path through the rdt2.0 FSM when the packet arrives intact:
1. sender, “Wait for call from above”: rdt_send(data); sndpkt = make_pkt(data, checksum); udt_send(sndpkt); move to “Wait for ACK or NAK”
2. receiver, “Wait for call from below”: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt); extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
3. sender, “Wait for ACK or NAK”: rdt_rcv(rcvpkt) && isACK(rcvpkt); Λ; back to “Wait for call from above”
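rdt2.0: error scenario
Path through the rdt2.0 FSM when the packet arrives corrupted:
1. sender, “Wait for call from above”: rdt_send(data); sndpkt = make_pkt(data, checksum); udt_send(sndpkt); move to “Wait for ACK or NAK”
2. receiver, “Wait for call from below”: rdt_rcv(rcvpkt) && corrupt(rcvpkt); udt_send(NAK)
3. sender, “Wait for ACK or NAK”: rdt_rcv(rcvpkt) && isNAK(rcvpkt); udt_send(sndpkt); stay in “Wait for ACK or NAK”
4. receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt); extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
5. sender: rdt_rcv(rcvpkt) && isACK(rcvpkt); Λ; back to “Wait for call from above”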
rdt2.0 has a fatal flaw!
what happens if ACK/NAK corrupted?
▪ sender doesn’t know what happened at receiver!
▪ can’t just retransmit: possible duplicate
handling duplicates:
▪ sender retransmits current pkt if ACK/NAK corrupted
▪ sender adds sequence number to each pkt
▪ receiver discards (doesn’t deliver up) duplicate pkt
stop and wait:
sender sends one packet, then waits for receiver response
1-bit sequence # is sufficient
rdt2.1: sender, handles garbled ACK/NAKs
state “Wait for call 0 from above”:
  rdt_send(data):
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> “Wait for ACK or NAK 0”
state “Wait for ACK or NAK 0”:
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ):
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt):
    Λ
    -> “Wait for call 1 from above”
state “Wait for call 1 from above”:
  rdt_send(data):
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    -> “Wait for ACK or NAK 1”
state “Wait for ACK or NAK 1”:
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ):
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt):
    Λ
    -> “Wait for call 0 from above”


rdt2.1: receiver, handles garbled ACK/NAKs
state “Wait for 0 from below”:
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> “Wait for 1 from below”
  rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
state “Wait for 1 from below”:
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> “Wait for 0 from below”
  rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)


rdt2.1: discussion
sender:
▪ seq # added to pkt
▪ two seq. #’s (0,1) will suffice. Why?
▪ must check if received ACK/NAK corrupted
▪ twice as many states
  • state must “remember” whether “expected” pkt should have seq # of 0 or 1
receiver:
▪ must check if received packet is duplicate
  • state indicates whether 0 or 1 is expected pkt seq #
▪ note: receiver can not know if its last ACK/NAK received OK at sender


rdt2.2: a NAK-free protocol
▪ same functionality as rdt2.1, using ACKs only
▪ instead of NAK, receiver sends ACK for last pkt received OK
  • receiver must explicitly include seq # of pkt being ACKed
▪ duplicate ACK at sender results in same action as NAK: retransmit current pkt


rdt2.2: sender, receiver fragments
sender FSM fragment
  state “Wait for call 0 from above”:
    rdt_send(data):
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      -> “Wait for ACK 0”
  state “Wait for ACK 0”:
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ):
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0):
      Λ

receiver FSM fragment
  transition into “Wait for 0 from below”:
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
      extract(rcvpkt,data)
      deliver_data(data)
      sndpkt = make_pkt(ACK1, chksum)
      udt_send(sndpkt)
  state “Wait for 0 from below”:
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)):
      udt_send(sndpkt)
rdt3.0: channels with errors and loss
new assumption: underlying channel can also lose packets (data, ACKs)
  • checksum, seq. #, ACKs, retransmissions will be of help … but not enough
approach: sender waits “reasonable” amount of time for ACK
▪ retransmits if no ACK received in this time
▪ if pkt (or ACK) just delayed (not lost):
  • retransmission will be duplicate, but seq. #’s already handle this
  • receiver must specify seq # of pkt being ACKed
▪ requires countdown timer


rdt3.0 sender
state “Wait for call 0 from above”:
  rdt_send(data):
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> “Wait for ACK0”
  rdt_rcv(rcvpkt):
    Λ
state “Wait for ACK0”:
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ):
    Λ
  timeout:
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0):
    stop_timer
    -> “Wait for call 1 from above”
state “Wait for call 1 from above”:
  rdt_send(data):
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> “Wait for ACK1”
  rdt_rcv(rcvpkt):
    Λ
state “Wait for ACK1”:
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) ):
    Λ
  timeout:
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1):
    stop_timer
    -> “Wait for call 0 from above”
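A rough way to see the alternating-bit logic at work is a small deterministic simulation. The sketch below is a simplification under stated assumptions: the function name and the `drop_data` / `drop_ack` parameters are invented for illustration, loss is scripted rather than random, and corruption and premature timeouts are not modelled (every loss simply leads to a timeout and a retransmission).

```python
def run_stop_and_wait(messages, drop_data=(), drop_ack=()):
    """Simulate an rdt3.0-style stop-and-wait transfer over a lossy channel.

    drop_data / drop_ack hold the 0-based indices of the data and ACK
    transmissions that the channel loses.
    Returns (messages delivered to the upper layer, data transmissions).
    """
    delivered = []
    expected = 0      # receiver: seq # it is waiting for
    data_tx = 0       # number of data packets put on the channel
    ack_tx = 0        # number of ACKs put on the channel
    seq = 0           # sender: seq # of the current packet
    for msg in messages:
        while True:
            # sender: udt_send(sndpkt), start_timer
            lost = data_tx in drop_data
            data_tx += 1
            if lost:
                continue          # timer expires, retransmit
            # receiver: deliver only a new packet, but ACK either way
            if seq == expected:
                delivered.append(msg)
                expected ^= 1
            ack_lost = ack_tx in drop_ack
            ack_tx += 1
            if ack_lost:
                continue          # timer expires, retransmit
            break                 # ACK for seq arrived: stop_timer
        seq ^= 1                  # alternate between 0 and 1
    return delivered, data_tx


print(run_stop_and_wait(["a", "b", "c"], drop_ack={0}))
```

With the first ACK lost, the sender retransmits, the receiver recognizes the duplicate by its sequence bit, and each message is still delivered exactly once.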
rdt3.0 in action
(a) no loss
1. sender: send pkt0
2. receiver: rcv pkt0, send ack0
3. sender: rcv ack0, send pkt1
4. receiver: rcv pkt1, send ack1
5. sender: rcv ack1, send pkt0
6. receiver: rcv pkt0, send ack0

(b) packet loss
1. sender: send pkt0
2. receiver: rcv pkt0, send ack0
3. sender: rcv ack0, send pkt1; pkt1 is lost (X)
4. sender: timeout, resend pkt1
5. receiver: rcv pkt1, send ack1
6. sender: rcv ack1, send pkt0
7. receiver: rcv pkt0, send ack0
rdt3.0 in action
(c) ACK loss
1. sender: send pkt0
2. receiver: rcv pkt0, send ack0
3. sender: rcv ack0, send pkt1
4. receiver: rcv pkt1, send ack1; ack1 is lost (X)
5. sender: timeout, resend pkt1
6. receiver: rcv pkt1 (detect duplicate), send ack1
7. sender: rcv ack1, send pkt0
8. receiver: rcv pkt0, send ack0

(d) premature timeout/ delayed ACK
1. sender: send pkt0
2. receiver: rcv pkt0, send ack0
3. sender: rcv ack0, send pkt1
4. receiver: rcv pkt1, send ack1; ack1 is delayed
5. sender: timeout, resend pkt1
6. sender: rcv ack1, send pkt0
7. receiver: rcv pkt1 (detect duplicate), send ack1
8. receiver: rcv pkt0, send ack0


Performance of rdt3.0
▪ rdt3.0 is correct, but performance stinks
▪ e.g.: 1 Gbps link, 15 ms prop. delay, 8000 bit packet:
  $D_{trans} = \frac{L}{R} = \frac{8000\ \text{bits}}{10^{9}\ \text{bits/sec}} = 8\ \text{microsecs}$
▪ U sender: utilization, fraction of time sender busy sending
  $U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$
▪ if RTT=30 msec, 1KB pkt every 30 msec: 33kB/sec thruput over 1 Gbps link
▪ network protocol limits use of physical resources!


rdt3.0: stop-and-wait operation
sender timeline:
  • t = 0: first packet bit transmitted
  • t = L/R: last packet bit transmitted
  • t = RTT + L/R: ACK arrives, send next packet
receiver timeline:
  • first packet bit arrives
  • last packet bit arrives, send ACK

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$


Pipelined protocols
pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
  • range of sequence numbers must be increased
  • buffering at sender and/or receiver
▪ two generic forms of pipelined protocols: go-Back-N, selective repeat


Pipelining: increased utilization
sender timeline:
  • t = 0: first packet bit transmitted
  • t = L/R: last bit transmitted
  • t = RTT + L/R: ACK arrives, send next packet
receiver timeline:
  • first packet bit arrives
  • last packet bit arrives, send ACK
  • last bit of 2nd packet arrives, send ACK
  • last bit of 3rd packet arrives, send ACK
3-packet pipelining increases utilization by a factor of 3!

$U_{sender} = \frac{3L/R}{RTT + L/R} = \frac{.024}{30.008} = 0.0008$
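The utilization figures can be reproduced with a few lines of Python. This is a sketch using the numbers from the example (8000-bit packet, 1 Gbps link, 30 ms RTT); the function name and the cap at 1.0 are illustrative choices, not part of the slides.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, n_pipelined=1):
    """Fraction of time the sender is busy when it may have
    n_pipelined packets in flight per round trip."""
    d_trans = packet_bits / rate_bps            # L / R, in seconds
    u = n_pipelined * d_trans / (rtt_s + d_trans)
    return min(1.0, u)                          # a link cannot exceed 100% use


u1 = sender_utilization(8000, 1e9, 0.030)
u3 = sender_utilization(8000, 1e9, 0.030, n_pipelined=3)
print(f"{u1:.5f} {u3:.5f}")  # prints 0.00027 0.00080
```

Tripling the number of in-flight packets triples the utilization, as long as the window does not yet fill the pipe.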


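Summary
▪ rdt1.0: underlying channel perfectly reliable
  • sender just sends, receiver just receives; no need to consider errors, loss, or reordering
▪ rdt2.0: channel with bit errors
  • introduces a checksum, plus a feedback step: ACK/NAK
  • flaw: what if the ACK/NAK is corrupted? The sender cannot determine the receiver's state and can only retransmit, which produces duplicate packets
▪ rdt2.1: sender, handles garbled ACK/NAKs
  • adds sequence numbers to solve the duplicate-packet problem
  • stop and wait: only two cases must be told apart, so a 1-bit sequence number is enough
▪ rdt2.2: a NAK-free protocol
  • no NAKs, only ACKs; the receiver ACKs the last correctly received packet
  • flaw: packet loss is not considered
▪ rdt3.0: channels with errors and loss
  • handles packet loss by introducing a countdown timer
  • flaw: a premature timeout causes duplicate packets; stop-and-wait -> low efficiency
▪ Pipelined protocols
  • 1. larger range of sequence numbers
  • 2. buffering at sender and receiver
  • 3. handling of lost, corrupted, and delayed packets: go-Back-N and selective repeat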
Pipelined protocols: overview
Go-back-N (GBN):
▪ sender can have up to N unacked packets in pipeline
▪ receiver only sends cumulative ack
  • doesn’t ack packet if there’s a gap
▪ sender has timer for oldest unacked packet
  • when timer expires, retransmit all unacked packets
Selective Repeat (SR):
▪ sender can have up to N unack’ed packets in pipeline
▪ rcvr sends individual ack for each packet
▪ sender maintains timer for each unacked packet
  • when timer expires, retransmit only that unacked packet


Go-Back-N: sender
▪ k-bit seq # in pkt header (TCP is 32 bit)
▪ “window” of up to N, consecutive unack’ed pkts allowed
▪ ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
  • may receive duplicate ACKs (see receiver)
▪ timer for oldest in-flight pkt
▪ timeout(n): retransmit packet n and all higher seq # pkts in window
GBN: sender extended FSM
initialization (Λ):
  base=1
  nextseqnum=1
state “Wait”:
  rdt_send(data):
    if (nextseqnum < base+N) {            // seq # is inside the window
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)             // no unACKed pkts were outstanding
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)                   // window is full
  timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer
  rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    Λ
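The sender FSM translates fairly directly into a small Python class. This is a sketch, not a full implementation: the class and attribute names are made up for illustration, the timer is reduced to a boolean flag, `sent` is only a log of what would be put on the channel, and `on_ack` uses `max(...)` as a small defensive deviation from the FSM so that a stale ACK cannot move `base` backwards.

```python
class GbnSender:
    """Hypothetical Go-Back-N sender mirroring the extended FSM."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq # -> data, kept until ACKed
        self.timer_running = False
        self.sent = []              # log of transmitted seq #s

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False            # window is full: refuse_data
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)       # udt_send
        if self.base == self.nextseqnum:
            self.timer_running = True           # start_timer
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Cumulative ACK: everything up to and including acknum is done.
        self.base = max(self.base, acknum + 1)
        for seq in [s for s in self.sndpkt if s < self.base]:
            del self.sndpkt[seq]
        # stop_timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True               # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)               # resend every unACKed pkt


s = GbnSender(window_size=4)
results = [s.rdt_send(d) for d in "abcdef"]   # last two are refused
s.on_ack(1)
s.on_ack(2)
s.rdt_send("e")
s.rdt_send("f")
s.on_timeout()
print(s.sent)
```

After the timeout the log shows packets 3 to 6 being sent a second time, which is the "go back N" behaviour.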
GBN: receiver extended FSM
initialization (Λ):
  expectedseqnum=1
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)
state “Wait”:
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum):
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++
  default:
    udt_send(sndpkt)

ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
  • may generate duplicate ACKs
  • need only remember expectedseqnum
▪ out-of-order pkt:
  • discard (don’t buffer): no receiver buffering!
  • re-ACK pkt with highest in-order seq #
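The receiver side is even smaller. In this sketch (names are illustrative) `rdt_rcv` returns the ACK number it would send, and an ACK of 0 before any packet has arrived is an assumption standing for "nothing received in order yet".

```python
class GbnReceiver:
    """Hypothetical Go-Back-N receiver: no buffering, cumulative ACKs."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seqnum, data, corrupt=False):
        """Return the ACK number sent back to the sender."""
        if not corrupt and seqnum == self.expectedseqnum:
            self.delivered.append(data)          # deliver_data
            self.expectedseqnum += 1
        # In every other case the packet is discarded and the ACK for the
        # highest in-order seq # is sent again.
        return self.expectedseqnum - 1


r = GbnReceiver()
# pkt3 is lost on its first transmission; pkt4 and pkt5 arrive out of order
arrivals = [1, 2, 4, 5, 3, 4, 5]
acks = [r.rdt_rcv(n, f"d{n}") for n in arrivals]
print(acks)  # prints [1, 2, 2, 2, 3, 4, 5]
```

The repeated ACK 2 values are the duplicate ACKs the sender sees while packet 3 is missing.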


GBN in action
sender (window N=4):
1. send pkt0
2. send pkt1
3. send pkt2 (lost, X)
4. send pkt3, then (wait)
5. rcv ack0, send pkt4
6. rcv ack1, send pkt5
7. ignore duplicate ACK
8. pkt 2 timeout: send pkt2, send pkt3, send pkt4, send pkt5
receiver:
1. receive pkt0, send ack0
2. receive pkt1, send ack1
3. receive pkt3, discard, (re)send ack1
4. receive pkt4, discard, (re)send ack1
5. receive pkt5, discard, (re)send ack1
6. rcv pkt2, deliver, send ack2
7. rcv pkt3, deliver, send ack3
8. rcv pkt4, deliver, send ack4
9. rcv pkt5, deliver, send ack5
A single packet error can cause GBN to retransmit a large number of packets, many unnecessarily.
3.4.4 Selective repeat
▪ receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
▪ sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt
▪ sender window
  • N consecutive seq #’s
  • limits seq #s of sent, unACKed pkts


Selective repeat: sender, receiver windows


Selective repeat
sender
data from above:
▪ if next available seq # in window, send pkt
timeout(n):
▪ resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
▪ mark pkt n as received
▪ if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]
▪ send ACK(n)
▪ out-of-order: buffer
▪ in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
▪ ACK(n)
otherwise:
▪ ignore
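The three receiver cases can be sketched as follows. The class is illustrative only and assumes unbounded sequence numbers (no wraparound), so the window arithmetic stays simple; a real protocol with a finite sequence-number space needs modular comparisons.

```python
class SrReceiver:
    """Hypothetical selective-repeat receiver with unbounded seq #s."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}        # out-of-order pkts waiting for delivery
        self.delivered = []

    def rdt_rcv(self, n, data):
        """Return the ACK number to send, or None to ignore the packet."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # deliver the in-order run starting at rcvbase, if any
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n            # already delivered: re-ACK so sender can advance
        return None             # outside both ranges: ignore


r = SrReceiver(window_size=4)
arrivals = [0, 1, 3, 4, 5, 2]                 # pkt2 arrives last
acks = [r.rdt_rcv(n, f"pkt{n}") for n in arrivals]
print(acks, r.delivered)
```

Packets 3, 4, and 5 are ACKed individually while buffered, and all four are handed up in order once packet 2 fills the gap.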


Selective repeat in action
sender (window N=4):
1. send pkt0
2. send pkt1
3. send pkt2 (lost, X)
4. send pkt3, then (wait)
5. rcv ack0, send pkt4
6. rcv ack1, send pkt5
7. record ack3 arrived
8. pkt 2 timeout: send pkt2
9. record ack4 arrived
10. record ack5 arrived
receiver:
1. receive pkt0, send ack0
2. receive pkt1, send ack1
3. receive pkt3, buffer, send ack3
4. receive pkt4, buffer, send ack4
5. receive pkt5, buffer, send ack5
6. rcv pkt2; deliver pkt2, pkt3, pkt4, pkt5; send ack2

Q: what happens when ack2 arrives?


Selective repeat: dilemma
example:
▪ seq #’s: 0, 1, 2, 3
▪ window size=3
▪ receiver sees no difference in two scenarios!
▪ duplicate data accepted as new in (b)
Q: what relationship between seq # size and window size to avoid problem in (b)?

(a) no problem
1. sender sends pkt0, pkt1, pkt2; receiver gets all three and its ACKs reach the sender
2. sender sends pkt3 (lost, X) and then a new pkt0
3. receiver will accept packet with seq number 0

(b) oops!
1. sender sends pkt0, pkt1, pkt2; receiver gets all three, but all three ACKs are lost (X)
2. sender times out and retransmits pkt0
3. receiver will accept packet with seq number 0

receiver can’t see sender side. receiver behavior identical in both cases! something’s (very) wrong!
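The standard textbook answer to the question is that the window size must be at most half the size of the sequence-number space. The sketch below checks this by brute force for the worst case in scenario (b); the function name is hypothetical.

```python
def windows_can_collide(seq_space, window):
    """True if a retransmitted old packet can land in the receiver's new window.

    Worst case: the receiver got a full window (the first `window` seq #s)
    and advanced, but every ACK was lost, so the sender retransmits them.
    """
    old = {i % seq_space for i in range(window)}               # sender's view
    new = {i % seq_space for i in range(window, 2 * window)}   # receiver's view
    return bool(old & new)


print(windows_can_collide(seq_space=4, window=3))  # prints True  (the dilemma)
print(windows_can_collide(seq_space=4, window=2))  # prints False (safe)
```

With 4 sequence numbers and a window of 3, the receiver's new window {3, 0, 1} overlaps the sender's old window {0, 1, 2}, so a retransmitted pkt0 is mistaken for new data.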


TCP: Overview
RFCs: 793, 1122, 1323, 2018, 2581
▪ point-to-point:
  • one sender, one receiver
▪ reliable, in-order byte stream:
  • no “message boundaries”
▪ pipelined:
  • TCP congestion and flow control set window size
▪ full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
▪ connection-oriented:
  • handshaking (exchange of control msgs) inits sender, receiver state before data exchange
  • different from circuit switching
▪ flow controlled:
  • sender will not overwhelm receiver


TCP segment structure
segment layout (32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | Urg data pointer
  options (variable length)
  application data (variable length)
field notes:
  • sequence number, acknowledgement number: counting by bytes of data (not segments!)
  • URG: urgent data (generally not used)
  • ACK: ACK # valid
  • PSH: push data now (generally not used)
  • RST, SYN, FIN: connection establish (setup, teardown commands)
  • receive window: # bytes rcvr willing to accept
  • checksum: Internet checksum (as in UDP)


TCP seq. numbers, ACKs
sequence numbers:
  • byte stream “number” of first byte in segment’s data
acknowledgements:
  • seq # of next byte expected from other side
  • cumulative ACK
Q: how receiver handles out-of-order segments
  • A: TCP spec doesn’t say, - up to implementor
[Figure: outgoing segment from sender and incoming segment to sender, each with source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer; sender sequence number space with window size N, divided into: sent ACKed; sent, not-yet ACKed (“in-flight”); usable but not yet sent; not usable]


TCP seq. numbers, ACKs
simple telnet scenario (Host A, Host B):
1. User types ‘C’. A to B: Seq=42, ACK=79, data = ‘C’
2. host ACKs receipt of ‘C’, echoes back ‘C’. B to A: Seq=79, ACK=43, data = ‘C’ (piggybacked ACK)
3. host ACKs receipt of echoed ‘C’. A to B: Seq=43, ACK=80


TCP round trip time, timeout
Q: how to set TCP timeout value?
▪ longer than RTT
  • but RTT varies
▪ too short: premature timeout, unnecessary retransmissions
▪ too long: slow reaction to segment loss
Q: how to estimate RTT?
▪ SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions
▪ SampleRTT will vary, want estimated RTT “smoother”
  • average several recent measurements, not just current SampleRTT


TCP round trip time, timeout
EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT
▪ exponential weighted moving average
▪ influence of past sample decreases exponentially fast
▪ typical value: α = 0.125
[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr; RTT (milliseconds), from 100 to 350, against time (seconds); curves for SampleRTT and EstimatedRTT]
TCP round trip time, timeout
▪ timeout interval: EstimatedRTT plus “safety margin”
  • large variation in EstimatedRTT -> larger safety margin
▪ estimate SampleRTT deviation from EstimatedRTT:
  DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
  (typically, β = 0.25)

TimeoutInterval = EstimatedRTT + 4*DevRTT
  (EstimatedRTT: estimated RTT; 4*DevRTT: “safety margin”)

* Check out the online interactive exercises for more examples:
http://gaia.cs.umass.edu/kurose_ross/interactive/
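The three formulas combine into a small estimator. The sketch below is illustrative: the class name is made up, the slides do not say how to initialise the estimates or in which order to update them, so it seeds EstimatedRTT with the first sample and DevRTT with half of it, and updates DevRTT before EstimatedRTT (the choices RFC 6298 makes). Treat those as assumptions.

```python
class RttEstimator:
    """Hypothetical EWMA-based RTT and timeout estimator."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample     # assumption: seed with 1st sample
        self.dev_rtt = first_sample / 2       # assumption: seed with half of it

    def update(self, sample_rtt):
        # DevRTT uses the EstimatedRTT from before this sample (assumption).
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
timeout = est.update(120.0)
print(est.estimated_rtt, est.dev_rtt, timeout)  # prints 102.5 42.5 272.5
```

A single 120 ms sample moves the estimate only from 100 ms to 102.5 ms, which is the smoothing effect of α = 0.125.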
TCP reliable data transfer
▪ TCP creates rdt service on top of IP’s unreliable service
  • pipelined segments
  • cumulative acks
  • single retransmission timer
▪ retransmissions triggered by:
  • timeout events
  • duplicate acks
let’s initially consider simplified TCP sender:
  • ignore duplicate acks
TCP sender events:
data rcvd from app:
▪ create segment with seq #
▪ seq # is byte-stream number of first data byte in segment
▪ start timer if timer not already running
  • think of timer as for oldest unacked segment
  • expiration interval: TimeOutInterval
timeout:
▪ retransmit segment that caused timeout
▪ restart timer
ack rcvd:
▪ if ack acknowledges previously unacked segments
  • update what is known to be ACKed
  • start timer if there are still unacked segments


TCP sender (simplified)
initialization (Λ):
  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum
state “wait for event”:
  data received from application above:
    create segment, seq. #: NextSeqNum
    pass segment to IP (i.e., “send”)
    NextSeqNum = NextSeqNum + length(data)
    if (timer currently not running)
      start timer
  timeout:
    retransmit not-yet-acked segment with smallest seq. #
    start timer
  ACK received, with ACK field value y:
    if (y > SendBase) {
      SendBase = y
      /* SendBase-1: last cumulatively ACKed byte */
      if (there are currently not-yet-acked segments)
        start timer
      else stop timer
    }
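The three events map onto three methods in the following sketch. It is a simplification under stated assumptions: the class and attribute names are invented, the timer is a boolean flag, `wire` merely logs the (seq #, length) of each segment handed to IP, and flow control, congestion control, and duplicate ACKs are ignored as in the simplified FSM.

```python
class SimpleTcpSender:
    """Hypothetical sender following the simplified TCP sender events."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = []             # (seq #, data) pairs, oldest first
        self.timer_running = False
        self.wire = []                # (seq #, length) of each segment sent

    def send(self, data):
        self.unacked.append((self.next_seq_num, data))
        self.wire.append((self.next_seq_num, len(data)))
        self.next_seq_num += len(data)
        if not self.timer_running:
            self.timer_running = True         # start timer

    def on_timeout(self):
        if self.unacked:
            seq, data = self.unacked[0]       # smallest not-yet-acked seq #
            self.wire.append((seq, len(data)))
            self.timer_running = True         # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y                # bytes up to y-1 are ACKed
            self.unacked = [(s, d) for s, d in self.unacked if s + len(d) > y]
            self.timer_running = bool(self.unacked)


# cumulative ACK scenario: ACK=100 is lost, ACK=120 covers both segments
tcp = SimpleTcpSender(initial_seq_num=92)
tcp.send(b"x" * 8)       # Seq=92, 8 bytes
tcp.send(b"y" * 20)      # Seq=100, 20 bytes
tcp.on_ack(120)
tcp.send(b"z" * 15)      # Seq=120, 15 bytes
print(tcp.wire)          # prints [(92, 8), (100, 20), (120, 15)]
```

Because the ACK is cumulative, the single ACK=120 clears both outstanding segments and nothing is retransmitted.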
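TCP: retransmission scenarios
lost ACK scenario (Host A, Host B):
1. A sends Seq=92, 8 bytes of data
2. B replies ACK=100, which is lost (X)
3. A’s timeout expires; A resends Seq=92, 8 bytes of data
4. B replies ACK=100; A sets SendBase=100

premature timeout (Host A, Host B):
1. A (SendBase=92) sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
2. B replies ACK=100 and ACK=120
3. A’s timeout expires before the ACKs arrive; A resends Seq=92, 8 bytes of data
4. the ACKs arrive: SendBase=100, then SendBase=120
5. B answers the retransmission with ACK=120; SendBase=120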
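TCP: retransmission scenarios
[Figure: Host A / Host B timeline.]
cumulative ACK: A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data. B’s ACK=100 is lost (X), but ACK=120 arrives before the timeout, so A does not retransmit and continues with Seq=120, 15 bytes of data.
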
TCP ACK generation [RFC 1122, RFC 2581]

event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

event at receiver: arrival of out-of-order segment with higher-than-expected seq. #. Gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq. # of next expected byte

event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
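
The four rows of the table can be read as a small decision function. The sketch below is only an illustration: the function name, its parameters and the short action labels are assumptions, and the branch for an old segment (seq # below the expected one) is not covered by the table, so its behavior here is a guess labeled as such.

```python
def receiver_action(seg_seq, expected_seq, ack_pending=False, gap_exists=False):
    """Pick the receiver's ACK action for one arriving segment.

    seg_seq:      seq # of the arriving segment
    expected_seq: seq # of the next in-order byte the receiver expects
    ack_pending:  True if a delayed ACK for an earlier segment is still waiting
    gap_exists:   True if out-of-order data is already buffered above expected_seq
    """
    if seg_seq > expected_seq:
        return "duplicate ACK"      # row 3: gap detected, re-ACK next expected byte
    if seg_seq < expected_seq:
        return "duplicate ACK"      # assumption: old data, not covered by the table
    if gap_exists:
        return "immediate ACK"      # row 4: segment starts at lower end of gap
    if ack_pending:
        return "cumulative ACK"     # row 2: one ACK covers both in-order segments
    return "delayed ACK"            # row 1: wait up to 500 ms for the next segment
```

For example, `receiver_action(120, 100)` returns "duplicate ACK", which is what drives fast retransmit on the sender side.
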


TCP fast retransmit
 time-out period often relatively long:
• long delay before resending lost packet
 detect lost segments via duplicate ACKs.
• sender often sends many segments back-to-back
• if segment is lost, there will likely be many duplicate ACKs.

TCP fast retransmit:
if sender receives 4 ACKs for same data (“triple duplicate ACKs”), resend unacked segment with smallest seq #
 likely that unacked segment lost, so don’t wait for timeout
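
The duplicate-ACK rule can be sketched as a counter over the stream of ACK numbers the sender sees. The function name and the list-based interface are assumptions made for illustration; the threshold of 3 duplicates (4 ACKs for the same data) is the one stated above.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ACK numbers at which a fast retransmit would be triggered."""
    retransmits = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                # resend the unacked segment that starts at byte `ack`
                retransmits.append(ack)
        else:
            last_ack, dup_count = ack, 0   # first ACK for this data is not a duplicate
    return retransmits


print(fast_retransmit_points([100, 100, 100, 100]))  # [100]
```

With only three ACKs for byte 100 (one original plus two duplicates) the list stays empty, so the sender keeps waiting for either another duplicate or the timeout.
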


TCP fast retransmit
[Figure: Host A / Host B timeline. A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data, which is lost (X). B returns ACK=100 four times. A then resends Seq=100, 20 bytes of data without waiting for the timeout.]
fast retransmit after sender receipt of triple duplicate ACK
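
(Assume segment N-1 arrived successfully.) Packet loss causes the receiver to send duplicate ACKs.
Reordering also causes the receiver to send duplicate ACKs.
Are duplicate ACKs caused by reordering, or by packet loss?
3 duplicate ACKs is a heuristic for deciding that a segment was lost.

From the cases enumerated, we can see:
• with no segment lost, there is a 40% chance of seeing 3 duplicate ACKs
• with reordering, 2 duplicate ACKs always occur
• with loss, 3 duplicate ACKs always occur

Given these probabilities, choosing 3 duplicate ACKs as the threshold is reasonable. In real packet captures, most fast retransmits happen after more than 3 duplicate ACKs.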


TCP flow control
application may remove data from TCP socket buffers … slower than TCP receiver is delivering (sender is sending)
flow control: receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast
[Figure: receiver protocol stack. Segments from sender pass up through IP code and TCP code into the TCP socket receiver buffers (OS), from which the application process reads.]


TCP flow control
 receiver “advertises” free buffer space by including rwnd value in TCP header of receiver-to-sender segments
• RcvBuffer size set via socket options (typical default is 4096 bytes)
• many operating systems autoadjust RcvBuffer
 sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value
 guarantees receive buffer will not overflow
[Figure: receiver-side buffering. TCP segment payloads enter RcvBuffer, which holds buffered data plus rwnd bytes of free buffer space; buffered data is passed to application process.]

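A short numeric sketch of the two sides of flow control. The slide names only RcvBuffer and rwnd; the byte counters `last_byte_rcvd`, `last_byte_read`, `last_byte_sent` and `last_byte_acked`, and both function names, are assumptions introduced here to make the bookkeeping explicit.

```python
def rwnd(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Free buffer space the receiver advertises.

    Buffered data is what has arrived but has not yet been read by the application.
    """
    buffered = last_byte_rcvd - last_byte_read
    return rcv_buffer - buffered


def sender_may_send(last_byte_sent, last_byte_acked, nbytes, rwnd_value):
    """True if sending nbytes more keeps in-flight data within the advertised rwnd."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= rwnd_value


window = rwnd(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)  # 2096
```

With 1000 bytes already in flight, `sender_may_send(5000, 4000, 1000, window)` is True, while asking to send 1200 bytes would exceed the 2096-byte window and is refused.
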
Connection Management
before exchanging data, sender/receiver “handshake”:
 agree to establish connection (each knowing the other willing to establish connection)
 agree on connection parameters
[Figure: client and server each hold, between application and network:
connection state: ESTAB
connection variables: seq # client-to-server, server-to-client; rcvBuffer size at server, client]
client: Socket clientSocket = new Socket("hostname","port number");
server: Socket connectionSocket = welcomeSocket.accept();

Agreeing to establish a connection
2-way handshake:
[Figure, top: “Let’s talk” / “OK”, after which both sides are ESTAB. Bottom: client chooses x and sends req_conn(x); server enters ESTAB and replies acc_conn(x); client enters ESTAB.]
Q: will 2-way handshake always work in network?
 variable delays
 retransmitted messages (e.g. req_conn(x)) due to message loss
 message reordering
 can’t “see” other side


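Agreeing to establish a connection
2-way handshake failure scenarios:
[Figure: two client / server timelines.]
scenario 1: client chooses x and sends req_conn(x); server enters ESTAB and replies acc_conn(x); client enters ESTAB, but has already retransmitted req_conn(x). Connection x completes, client terminates, server forgets x. The retransmitted req_conn(x) then arrives and the server enters ESTAB again: half open connection! (no client!)
scenario 2: as in scenario 1, but the client also sends data(x+1), which the server accepts, and retransmits data(x+1). Connection x completes, client terminates, server forgets x. The old req_conn(x) arrives and the server enters ESTAB; the retransmitted data(x+1) then arrives and the server accepts it a second time.
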
TCP 3-way handshake
client state: LISTEN → SYNSENT → ESTAB
server state: LISTEN → SYN RCVD → ESTAB
1. client: choose init seq num, x; send TCP SYN msg (SYNbit=1, Seq=x); enter SYNSENT
2. server: choose init seq num, y; send TCP SYNACK msg, acking SYN (SYNbit=1, Seq=y, ACKbit=1, ACKnum=x+1); enter SYN RCVD
3. client: received SYNACK(x) indicates server is live; send ACK for SYNACK (ACKbit=1, ACKnum=y+1); this segment may contain client-to-server data; enter ESTAB
4. server: received ACK(y) indicates client is live; enter ESTAB
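
The exchange above can be traced with a few lines of Python. This is a sketch of the message fields and state changes only, not a socket program; the function name, the dictionary layout and the log format are assumptions, while the field names and state labels follow the slide.

```python
def three_way_handshake(x, y):
    """Trace the three segments for client ISN x and server ISN y.

    Returns a list of (direction, segment, client_state, server_state) tuples,
    where the states are those reached once the segment has been processed.
    """
    client, server = "LISTEN", "LISTEN"   # starting labels as drawn on the slide
    log = []

    syn = {"SYNbit": 1, "Seq": x}
    client, server = "SYNSENT", "SYN RCVD"
    log.append(("client->server", syn, client, server))

    synack = {"SYNbit": 1, "Seq": y, "ACKbit": 1, "ACKnum": syn["Seq"] + 1}
    client = "ESTAB"                      # SYNACK shows the server is live
    log.append(("server->client", synack, client, server))

    ack = {"ACKbit": 1, "ACKnum": synack["Seq"] + 1}
    server = "ESTAB"                      # final ACK shows the client is live
    log.append(("client->server", ack, client, server))
    return log


for direction, segment, client_state, server_state in three_way_handshake(x=100, y=300):
    print(direction, segment, client_state, server_state)
```

Each side acknowledges the other's initial sequence number plus one, which is what lets a stale req_conn-style duplicate be recognized instead of silently opening a half-open connection.
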


TCP 3-way handshake: FSM
(transitions written as event / action; Λ means no event or no action)
server side, Socket connectionSocket = welcomeSocket.accept();
• closed → listen: Λ
• listen → SYN rcvd: SYN(x) / SYNACK(seq=y,ACKnum=x+1), create new socket for communication back to client
• SYN rcvd → ESTAB: ACK(ACKnum=y+1) / Λ
client side, Socket clientSocket = new Socket("hostname","port number");
• closed → SYN sent: Λ / SYN(seq=x)
• SYN sent → ESTAB: SYNACK(seq=y,ACKnum=x+1) / ACK(ACKnum=y+1)


TCP: closing a connection
 client, server each close their side of
connection
• send TCP segment with FIN bit = 1
 respond to received FIN with ACK
• on receiving FIN, ACK can be combined
with own FIN
 simultaneous FIN exchanges can be
handled



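TCP: closing a connection
client state: ESTAB → FIN_WAIT_1 → FIN_WAIT_2 → TIMED_WAIT → CLOSED
server state: ESTAB → CLOSE_WAIT → LAST_ACK → CLOSED
1. client: clientSocket.close(); send FINbit=1, seq=x; enter FIN_WAIT_1 (can no longer send but can receive data)
2. server: send ACKbit=1, ACKnum=x+1; enter CLOSE_WAIT (can still send data)
3. client: on receiving the ACK, enter FIN_WAIT_2 (wait for server close)
4. server: send FINbit=1, seq=y; enter LAST_ACK (can no longer send data)
5. client: send ACKbit=1, ACKnum=y+1; enter TIMED_WAIT (timed wait for 2*max segment lifetime), then CLOSED
6. server: on receiving the ACK, enter CLOSED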
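
The two state sequences can be written as lookup tables and replayed. This is a table-driven sketch, not an implementation of TCP teardown: the event strings and the `run` helper are assumptions, and the state names (including TIMED_WAIT) are copied from the slide.

```python
# (current state, event) -> (next state, action); action is None when nothing is sent
CLIENT_CLOSE = {
    ("ESTAB", "close()"): ("FIN_WAIT_1", "send FIN"),
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv FIN"): ("TIMED_WAIT", "send ACK"),
    ("TIMED_WAIT", "2*MSL elapsed"): ("CLOSED", None),
}

SERVER_CLOSE = {
    ("ESTAB", "recv FIN"): ("CLOSE_WAIT", "send ACK"),
    ("CLOSE_WAIT", "close()"): ("LAST_ACK", "send FIN"),
    ("LAST_ACK", "recv ACK"): ("CLOSED", None),
}


def run(transitions, state, events):
    """Replay events through a transition table and return the visited states."""
    trace = [state]
    for event in events:
        state, _action = transitions[(state, event)]
        trace.append(state)
    return trace


print(run(CLIENT_CLOSE, "ESTAB", ["close()", "recv ACK", "recv FIN", "2*MSL elapsed"]))
print(run(SERVER_CLOSE, "ESTAB", ["recv FIN", "close()", "recv ACK"]))
```

An event that is not in the table raises `KeyError`, which is a deliberate choice here: the sketch covers only the orderly close shown on the slide, not simultaneous FINs.
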


Principles of congestion
control
congestion:
 informally: “too many sources
sending too much data too fast for
network to handle”
 different from flow control!
 manifestations:
• lost packets (buffer overflow at routers)
• long delays (queueing in router buffers)
 a top-10 problem!



Causes/costs of congestion: scenario 1
 two senders, two receivers
 one router, infinite buffers
 output link capacity: R
 no retransmission
[Figure: Host A and Host B send original data at rate λin into unlimited shared output link buffers; throughput is λout. Plots: λout vs. λin and delay vs. λin, with R/2 marked on the axes.]
 maximum per-connection throughput: R/2
 large delays as arrival rate, λin, approaches capacity

Causes/costs of congestion: scenario 2
 one router, finite buffers, so packets can be lost
 sender retransmission of timed-out packet
• application-layer input = application-layer output: λin = λout
• transport-layer input includes retransmissions: λ'in ≥ λin
[Figure: Host A offers λin (original data) and λ'in (original data, plus retransmitted data) to finite shared output link buffers; Host B receives λout.]

Causes/costs of congestion: scenario 2
idealization: perfect knowledge
 sender knows whether the router buffers are free
 sender sends only when router buffers available
[Figure: Host A keeps a copy of each packet and sends when there is free buffer space; λin is original data, λ'in is original data plus retransmitted data; Host B receives λout through finite shared output link buffers. Plot: λout vs. λin, with R/2 marked on both axes.]

Causes/costs of congestion: scenario 2
Idealization: known loss
 packets can be lost, dropped at router due to full buffers
 sender only resends if packet known to be lost
[Figure: Host A sends λin (original data) and λ'in (original data, plus retransmitted data). A packet that finds no buffer space at the router is dropped; A’s copy is resent when there is free buffer space. Host B receives λout. Plot: λout vs. λin, with R/2 marked on both axes.]
when sending at R/2, some packets are retransmissions, but asymptotic goodput is still R/2 (why?)

Causes/costs of congestion: scenario 2
Realistic: duplicates
 packets can be lost, dropped at router due to full buffers
 sender times out prematurely, sending two copies, both of which are delivered
[Figure: Host A’s timeout fires while there is free buffer space, so its copy is sent again alongside the original; Host B receives λout. Plot: λout vs. λin, with R/2 marked on both axes.]
when sending at R/2, some packets are retransmissions, including duplicates that are delivered!
“costs” of congestion:
 more work (retrans) for given “goodput”
 unneeded retransmissions: link carries multiple copies of pkt
• decreasing goodput


Causes/costs of congestion: scenario 3
 four senders
 multihop paths
 timeout/retransmit
Q: what happens as λin and λ'in increase?
A: as red λ'in increases, all arriving blue pkts at upper queue are dropped, blue throughput → 0
[Figure: Hosts A, B, C and D send over multihop paths through routers with finite shared output link buffers; each offers λin (original data) and λ'in (original data, plus retransmitted data); λout is delivered.]


Causes/costs of congestion: scenario 3
[Plot: λout vs. λ'in, with C/2 marked on both axes.]
another “cost” of congestion:
 when packet dropped, any “upstream” transmission capacity used for that packet was wasted!


TCP congestion control: additive increase multiplicative decrease
 approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
• additive increase: increase cwnd by 1 MSS every RTT until loss detected
• multiplicative decrease: cut cwnd in half after loss
[Figure: AIMD saw tooth behavior: probing for bandwidth. cwnd (TCP sender congestion window size) vs. time: additively increase window size … until loss occurs (then cut window in half).]

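The saw tooth can be reproduced with a few lines of Python. This is a sketch with cwnd counted in whole MSS units; the function name and the idea of passing in the set of RTT rounds that experience a loss are assumptions made for illustration.

```python
def aimd(rounds, loss_rounds, cwnd=1):
    """Return cwnd (in MSS) at the start of each RTT round.

    loss_rounds: set of round indices in which a loss is detected (assumed input).
    """
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            cwnd = max(1, cwnd // 2)   # multiplicative decrease: cut cwnd in half
        else:
            cwnd += 1                  # additive increase: +1 MSS per RTT
    return history


print(aimd(8, loss_rounds={4}, cwnd=10))  # [10, 11, 12, 13, 14, 7, 8, 9]
```

The window climbs by one MSS per round, drops from 14 to 7 at the loss, and then resumes its linear climb.
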
TCP Congestion Control: details
[Figure: sender sequence number space, marking last byte ACKed, bytes sent but not yet ACKed (“in-flight”), and last byte sent; the in-flight span is bounded by cwnd.]
 sender limits transmission:
    LastByteSent - LastByteAcked ≤ cwnd
 cwnd is dynamic, function of perceived network congestion
TCP sending rate:
 roughly: send cwnd bytes, wait RTT for ACKS, then send more bytes
    rate ≈ cwnd / RTT bytes/sec

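Both relations can be checked numerically. The function names are assumptions, and the MSS of 1460 bytes and RTT of 0.25 s are example values chosen for the illustration, not figures from the slides.

```python
def within_cwnd(last_byte_sent, last_byte_acked, cwnd):
    """True if the in-flight bytes respect LastByteSent - LastByteAcked <= cwnd."""
    return last_byte_sent - last_byte_acked <= cwnd


def approx_rate(cwnd_bytes, rtt_seconds):
    """Rough sending rate in bytes/sec: one window of cwnd bytes per RTT."""
    return cwnd_bytes / rtt_seconds


MSS = 1460                                   # assumed example value, in bytes
print(within_cwnd(5000, 4000, 1 * MSS))      # True: 1000 bytes in flight
print(approx_rate(10 * MSS, 0.25))           # 58400.0 bytes/sec
```
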


TCP Slow Start
 when connection begins, increase rate exponentially until first loss event:
• initially cwnd = 1 MSS
• double cwnd every RTT
• done by incrementing cwnd for every ACK received
 summary: initial rate is slow but ramps up exponentially fast
[Figure: Host A / Host B timeline. A sends one segment in the first RTT, two segments in the second, four segments in the third.]
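
The doubling falls out of the per-ACK increment, as this sketch shows. It assumes no loss, one ACK per segment, and cwnd counted in whole MSS units; the function name is hypothetical.

```python
def slow_start(rtts):
    """Return cwnd (in MSS) at the start of each RTT during loss-free slow start."""
    cwnd = 1                       # initially cwnd = 1 MSS
    per_rtt = []
    for _ in range(rtts):
        per_rtt.append(cwnd)
        acks = cwnd                # assumption: every segment sent this RTT is ACKed
        for _ in range(acks):
            cwnd += 1              # increment cwnd by 1 MSS for every ACK received
    return per_rtt


print(slow_start(4))  # [1, 2, 4, 8]
```
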


TCP: detecting, reacting to loss
 loss indicated by timeout:
• cwnd set to 1 MSS;
• window then grows exponentially (as in slow start) to threshold, then grows linearly
 loss indicated by 3 duplicate ACKs: TCP RENO
• dup ACKs indicate network capable of delivering some segments
• cwnd is cut in half; window then grows linearly
 TCP Tahoe always sets cwnd to 1 MSS (timeout or 3 duplicate acks)

TCP: switching from slow start to CA
Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before timeout.

Implementation:
 variable ssthresh
 on loss event, ssthresh is set to 1/2 of cwnd just before loss event
[Figure annotation: Fast recovery]
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/

Summary: TCP Congestion Control
(FSM with three states; Λ means no event or no action)
initial (Λ): cwnd = 1 MSS; ssthresh = 64 KB; dupACKcount = 0; enter slow start

slow start:
• duplicate ACK: dupACKcount++
• new ACK: cwnd = cwnd + MSS; dupACKcount = 0; transmit new segment(s), as allowed
• cwnd > ssthresh: Λ; go to congestion avoidance
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; stay in slow start
• dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; go to fast recovery

congestion avoidance:
• new ACK: cwnd = cwnd + MSS · (MSS/cwnd); dupACKcount = 0; transmit new segment(s), as allowed
• duplicate ACK: dupACKcount++
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; go to slow start
• dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; go to fast recovery

fast recovery:
• duplicate ACK: cwnd = cwnd + MSS; transmit new segment(s), as allowed
• new ACK: cwnd = ssthresh; dupACKcount = 0; go to congestion avoidance
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; go to slow start
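
The FSM can be sketched as a small Python class. It tracks only cwnd, ssthresh, dupACKcount and the state name; actual retransmission and segment transmission are left as comments. The class and method names and the MSS value of 1460 bytes are assumptions, and "+ 3" in the fast-recovery entry is read as 3 MSS.

```python
MSS = 1460  # assumed segment size in bytes


class RenoSender:
    """Window bookkeeping for the slow start / congestion avoidance / fast recovery FSM."""

    def __init__(self):
        self.state = "slow start"
        self.cwnd = 1 * MSS
        self.ssthresh = 64 * 1024
        self.dup_acks = 0

    def on_new_ack(self):
        if self.state == "fast recovery":
            self.cwnd = self.ssthresh
            self.state = "congestion avoidance"
        elif self.state == "slow start":
            self.cwnd += MSS
            if self.cwnd > self.ssthresh:          # threshold crossed: switch to linear growth
                self.state = "congestion avoidance"
        else:
            self.cwnd += MSS * MSS / self.cwnd     # about 1 MSS per RTT overall
        self.dup_acks = 0
        # transmit new segment(s), as allowed

    def on_dup_ack(self):
        if self.state == "fast recovery":
            self.cwnd += MSS                       # each dup ACK means a segment left the network
            return
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS
            self.state = "fast recovery"           # retransmit missing segment here

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1 * MSS
        self.dup_acks = 0
        self.state = "slow start"                  # retransmit missing segment here


s = RenoSender()
s.on_new_ack()                 # slow start: cwnd grows to 2 MSS
for _ in range(3):
    s.on_dup_ack()             # third duplicate triggers fast recovery
print(s.state, s.ssthresh / MSS, s.cwnd / MSS)
```

After the third duplicate ACK the sender is in fast recovery with ssthresh at 1 MSS and cwnd at 4 MSS; a following new ACK deflates cwnd to ssthresh and moves it to congestion avoidance.
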


TCP throughput
 avg. TCP thruput as function of window size, RTT?
• ignore slow start, assume always data to send
 W: window size (measured in bytes) where loss occurs
• avg. window size (# in-flight bytes) is ¾ W
• avg. thruput is ¾ W per RTT
    avg TCP thruput = (3/4) · W / RTT bytes/sec
[Figure: window saw tooth, with W/2 marked.]


TCP Futures: TCP over “long, fat pipes”
 example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
 requires W = 83,333 in-flight segments
 throughput in terms of segment loss probability (L) [Mathis 1997]:
    TCP throughput = (1.22 · MSS) / (RTT · √L)
➜ to achieve 10 Gbps throughput, need a loss rate of L = 2·10^-10, a very small loss rate!
 new versions of TCP for high-speed
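
The two numbers on this slide can be recomputed from the formula. This is a worked check rather than new material; the variable and function names are assumptions, and MSS is expressed in bits so that throughput comes out in bits/sec.

```python
import math

MSS_BITS = 1500 * 8      # 1500 byte segments
RTT = 0.1                # 100 ms
TARGET = 10e9            # 10 Gbps


def mathis_throughput(mss_bits, rtt, loss):
    """TCP throughput = 1.22 * MSS / (RTT * sqrt(L)), in bits/sec."""
    return 1.22 * mss_bits / (rtt * math.sqrt(loss))


# Window needed to keep the pipe full: bandwidth-delay product in segments.
window_segments = TARGET * RTT / MSS_BITS

# Solve the Mathis formula for L at the target throughput.
loss = (1.22 * MSS_BITS / (RTT * TARGET)) ** 2

print(round(window_segments))   # 83333
print(f"{loss:.1e}")            # 2.1e-10
```

Plugging `loss` back into `mathis_throughput` returns the 10 Gbps target, which confirms the rearrangement.
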


TCP Fairness
fairness goal: if K TCP sessions share
same bottleneck link of bandwidth R,
each should have average rate of R/K
TCP connection 1

bottleneck
router
capacity R
TCP connection 2



Why is TCP fair?
two competing sessions:
 additive increase gives slope of 1, as throughput increases
 multiplicative decrease decreases throughput proportionally
[Figure: Connection 2 throughput vs. Connection 1 throughput, both axes running to R, with the equal bandwidth share line. The trajectory alternates between “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”.]
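
The convergence argument can be simulated. The sketch below assumes both sessions have the same RTT, see every loss at the same time, and lose whenever their combined rate exceeds the capacity; those synchronization assumptions, the function name and the starting rates are all illustrative.

```python
def two_flow_aimd(x1, x2, capacity, steps):
    """Evolve two AIMD flows sharing one bottleneck; return their final rates."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # loss: both decrease by a factor of 2
        else:
            x1, x2 = x1 + 1, x2 + 1    # congestion avoidance: both add the same amount
    return x1, x2


x1, x2 = two_flow_aimd(x1=80.0, x2=10.0, capacity=100.0, steps=500)
print(abs(x1 - x2) < 0.01)  # True: the flows end up with nearly equal shares
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts the gap in half, so the pair drifts toward the equal bandwidth share line.
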


Fairness (more)
Fairness and UDP
 multimedia apps often do not use TCP
• do not want rate throttled by congestion control
 instead use UDP:
• send audio/video at constant rate, tolerate packet loss

Fairness, parallel TCP connections
 application can open multiple parallel connections between two hosts
 web browsers do this
 e.g., link of rate R with 9 existing connections:
• new app asks for 1 TCP, gets rate R/10
• new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)


Explicit Congestion Notification (ECN)
network-assisted congestion control:
 two bits in IP header (ToS field) marked by network router to indicate congestion
 congestion indication carried to receiving host
 receiver (seeing congestion indication in IP datagram) sets ECE bit on receiver-to-sender ACK segment to notify sender of congestion
[Figure: source and destination protocol stacks (application, transport, network, link, physical). The IP datagram is labeled ECN=00 near the source and ECN=11 near the destination; the destination returns a TCP ACK segment with ECE=1.]
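 ECN
• 00: sending host does not support ECN
• 01 or 10: sending host supports ECN
• 11: a router is experiencing congestion
 Use
• An ECN-capable host sets the ECN field to 01 or 10 when it sends a packet. If a router on the path supports ECN and is experiencing congestion, it sets the ECN field to 11.
• If the value is already 11, downstream routers do not modify it.
 TCP support for ECN
• When a router sets the ECN field of an IP packet to 11, it is the receiver, not the sender, that is notified of congestion on the path.
• ECN uses the TCP header to tell the sender that the network is experiencing congestion, and to tell the receiver that the sender has received the receiver’s congestion notification and has reduced its sending rate.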
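
The codepoint rules above can be sketched as two small functions. The constant names follow the usual ECN terminology (Not-ECT, ECT, CE); the function names are assumptions, and the sketch models only the two-bit field, not a real IP or TCP header.

```python
NOT_ECT = 0b00   # sending host does not support ECN
ECT_1 = 0b01     # sending host supports ECN
ECT_0 = 0b10     # sending host supports ECN
CE = 0b11        # congestion experienced


def router_mark(ecn_bits, congested):
    """ECN field of a packet after it passes one ECN-capable router."""
    if congested and ecn_bits in (ECT_0, ECT_1):
        return CE                 # mark instead of dropping
    return ecn_bits               # 00 is never marked; 11 is left unchanged downstream


def receiver_ece(ecn_bits):
    """ECE bit the receiver sets on its ACK back to the sender."""
    return 1 if ecn_bits == CE else 0


bits = router_mark(ECT_0, congested=True)     # marked by a congested router
bits = router_mark(bits, congested=False)     # a later router leaves 11 alone
print(bits == CE, receiver_ece(bits))         # True 1
```
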


Chapter 3: summary
 principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control
 instantiation, implementation in the Internet
• UDP
• TCP
next:
 leaving the network “edge” (application, transport layers)
 into the network “core”
 two network layer chapters:
• data plane
• control plane
