
TCP Part 2


Announcements

v Tutorial 1 in Week 5
§ Problem solving prep for exam
v Assignment 1
§ Have you started?
§ Do not delay
§ Be careful about plagiarism
§ Read specification thoroughly
§ Post questions on forum
v Mid-semester Exam in Week 6
§ Monday, 29th August during regular lecture hours
§ Details at end of slide set

Transport Layer (contd.) 2


Transport Layer Outline
3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
§ Pipelined protocols
3.5 connection-oriented transport: TCP
§ segment structure
§ reliable data transfer
§ flow control
§ connection management
3.6 principles of congestion control
3.7 TCP congestion control

Self Study

Practice Problem: RDT

http://www-net.cs.umass.edu/kurose_ross/interactive/rdt22.php

Pipelined protocols
pipelining: sender allows multiple “in-flight”, yet-to-be-acknowledged pkts
§ range of sequence numbers must be increased
§ buffering at sender and/or receiver

v two generic forms of pipelined (sliding window) protocols: Go-Back-N, selective repeat

Pipelining: increased utilization
[Figure: sender/receiver timeline spanning one RTT]
§ sender: first packet bit transmitted, t = 0
§ sender: last bit transmitted, t = L / R
§ receiver: first packet bit arrives
§ receiver: last packet bit arrives, send ACK
§ receiver: last bit of 2nd packet arrives, send ACK
§ receiver: last bit of 3rd packet arrives, send ACK
§ sender: ACK arrives, send next packet, t = RTT + L / R

3-packet pipelining increases utilization by a factor of 3!

U_sender = (3L / R) / (RTT + L / R) = (3 x 125) / (100 + 125) = 1.67
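The arithmetic on this slide can be checked with a short sketch. The function name `sender_utilization` is made up for illustration, and the values 125 and 100 are simply the numbers printed on the slide, taken to be in the same time unit.

```python
def sender_utilization(n_packets, l_over_r, rtt):
    """Slide formula: (n * L/R) / (RTT + L/R); all times in the same unit."""
    return (n_packets * l_over_r) / (rtt + l_over_r)


# Numbers taken from the slide: L/R = 125, RTT = 100.
stop_and_wait = sender_utilization(1, 125, 100)
pipelined = sender_utilization(3, 125, 100)

print(f"stop-and-wait: {stop_and_wait:.2f}")
print(f"3-packet pipelining: {pipelined:.2f}")
print(f"improvement factor: {pipelined / stop_and_wait:.1f}")
```

A ratio above 1 means the window already covers the whole round trip, so the sender never has to sit idle waiting for an ACK.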

Pipelined protocols: overview
Go-back-N:
v sender can have up to N unacked packets in pipeline
v receiver only sends cumulative ack
§ doesn’t ack packet if there’s a gap
v sender has timer for oldest unacked packet
§ when timer expires, retransmit all unacked packets

Selective Repeat:
v sender can have up to N unack’ed packets in pipeline
v rcvr sends individual ack for each packet
v sender maintains timer for each unacked packet
§ when timer expires, retransmit only that unacked packet

Go-Back-N: sender
v k-bit seq # in pkt header
v “window” of up to N, consecutive unack’ed pkts allowed

v ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
§ may receive duplicate ACKs (see receiver)
v timer for oldest in-flight pkt
v timeout(n): retransmit packet n and all higher seq # pkts in
window
Applet: http://media.pearsoncmg.com/aw/aw_kurose_network_2/applets/go-back-n/go-back-n.html

GBN: sender extended FSM
initial state (Λ):
    base = 1
    nextseqnum = 1

state: Wait

rdt_send(data):
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    Λ

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer
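The sender FSM above can be sketched as a small Python class. This is only an illustration under assumptions: the class name `GBNSender`, the `wire` list that stands in for udt_send, and the boolean `timer_running` flag are all hypothetical, and the guard that ignores ACKs below `base` is an addition that the FSM itself does not show.

```python
class GBNSender:
    """Toy Go-Back-N sender; 'sending' just appends seq #s to self.wire."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq # -> data, kept until ACKed
        self.timer_running = False
        self.wire = []              # stands in for udt_send()

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False            # window full: refuse_data
        self.sndpkt[self.nextseqnum] = data
        self.wire.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        if acknum < self.base:
            return                  # duplicate or old ACK (added guard)
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1      # cumulative ACK slides the window
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.wire.append(seq)   # resend every unACKed packet


sender = GBNSender(window_size=4)
accepted = [sender.rdt_send(d) for d in "abcde"]   # fifth call is refused
sender.on_ack(2)                                   # ACKs packets 1 and 2
sender.rdt_send("e")
sender.on_timeout()                                # resends 3, 4, 5
print(accepted, sender.wire)
```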
GBN: receiver extended FSM
initial state (Λ):
    expectedseqnum = 1
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)

state: Wait

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum):
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++

default:
    udt_send(sndpkt)

v ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
§ may generate duplicate ACKs
§ need only remember expectedseqnum
v out-of-order pkt:
§ discard (don’t buffer): no receiver buffering!
§ re-ACK pkt with highest in-order seq #
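A minimal sketch of this receiver follows. The class name `GBNReceiver` is hypothetical, and returning 0 as the ACK number before any packet has been delivered is an assumption made so that the example has a well-defined first value.

```python
class GBNReceiver:
    """Toy Go-Back-N receiver: no buffering, only expectedseqnum is stored."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data, corrupt=False):
        """Return the seq # carried by the ACK sent in response."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        # Otherwise the packet is discarded. Either way, ACK the highest
        # in-order seq # seen so far (0 means nothing delivered yet).
        return self.expectedseqnum - 1


receiver = GBNReceiver()
# Packet 3 is lost, so 4, 5 and 6 arrive out of order and are discarded.
first_pass = [receiver.rdt_rcv(s, f"pkt{s}") for s in (1, 2, 4, 5, 6)]
# After the sender's timeout, 3..6 are resent and accepted in order.
second_pass = [receiver.rdt_rcv(s, f"pkt{s}") for s in (3, 4, 5, 6)]
print(first_pass, second_pass)
```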
GBN in action
sender window (N=4); pkt2 is lost in transit

sender                          receiver
send pkt0
send pkt1
send pkt2 (lost)                receive pkt0, send ack0
send pkt3                       receive pkt1, send ack1
(wait)
                                receive pkt3, discard, (re)send ack1
rcv ack0, send pkt4
rcv ack1, send pkt5             receive pkt4, discard, (re)send ack1
ignore duplicate ACK            receive pkt5, discard, (re)send ack1
pkt 2 timeout
send pkt2
send pkt3
send pkt4                       rcv pkt2, deliver, send ack2
send pkt5                       rcv pkt3, deliver, send ack3
                                rcv pkt4, deliver, send ack4
                                rcv pkt5, deliver, send ack5

Selective repeat
v receiver individually acknowledges all correctly
received pkts
§ buffers pkts, as needed, for eventual in-order delivery
to upper layer
v sender only resends pkts for which ACK not
received
§ sender timer for each unACKed pkt
v sender window
§ N consecutive seq #’s
§ limits seq #s of sent, unACKed pkts

Applet: http://media.pearsoncmg.com/aw/aw_kurose_network_3/applets/SelectRepeat/SR.html

Selective repeat: sender, receiver windows

Selective repeat
sender
data from above:
v if next available seq # in window, send pkt
timeout(n):
v resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
v mark pkt n as received
v if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]
v send ACK(n)
v out-of-order: buffer
v in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
v ACK(n)
otherwise:
v ignore
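The receiver rules above can be sketched as follows. The class name `SRReceiver` is hypothetical, and the sketch assumes sequence numbers that grow without wrapping around, which sidesteps the window-size question raised later in these slides.

```python
class SRReceiver:
    """Toy selective-repeat receiver with unbounded sequence numbers."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}            # out-of-order packets awaiting delivery
        self.delivered = []

    def on_packet(self, n, data):
        """Return the ACK number sent, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase, if any.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n                # already delivered: re-ACK only
        return None                 # otherwise: ignore


rcv = SRReceiver(window_size=4)
# pkt2 is lost at first; pkt3..pkt5 are buffered until pkt2 is resent.
acks = [rcv.on_packet(n, f"pkt{n}") for n in (0, 1, 3, 4, 5, 2)]
print(acks, rcv.delivered, rcv.rcvbase)
```

Individual ACKs go out in arrival order (0, 1, 3, 4, 5, 2), yet the application still sees pkt0 through pkt5 in order.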

Selective repeat in action
sender window (N=4); pkt2 is lost in transit

sender                          receiver
send pkt0
send pkt1
send pkt2 (lost)                receive pkt0, send ack0
send pkt3                       receive pkt1, send ack1
(wait)
                                receive pkt3, buffer, send ack3
rcv ack0, send pkt4
rcv ack1, send pkt5             receive pkt4, buffer, send ack4
record ack3 arrived             receive pkt5, buffer, send ack5
pkt 2 timeout
send pkt2
record ack4 arrived
record ack5 arrived             rcv pkt2; deliver pkt2, pkt3, pkt4, pkt5; send ack2

Q: what happens when ack2 arrives?

Selective repeat: dilemma

example:
v seq #’s: 0, 1, 2, 3
v window size=3
v receiver sees no difference in two scenarios!
v duplicate data accepted as new in (b)

[Figure: sender window and receiver window (after receipt) over seq #s 0123012, two scenarios]
(a) no problem: pkt0, pkt1, pkt2 arrive and are ACKed; the sender moves on, pkt3 is lost (X), then a new pkt0 is sent; receiver will accept packet with seq number 0
(b) oops!: pkt0, pkt1, pkt2 arrive but their ACKs are lost (X); sender times out and retransmits pkt0; receiver will accept packet with seq number 0

receiver can’t see sender side. receiver behavior identical in both cases! something’s (very) wrong!

Q: what relationship between seq # size and window size to avoid problem in (b)?
A: window size must be less than or equal to half the size of the sequence number space
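The rule in the answer can be checked mechanically. In this sketch the helper names `window` and `window_is_safe` are made up for illustration; the numbers are the slide's example (seq #'s 0 to 3, window size 3).

```python
def window(base, size, seq_space):
    """Sequence numbers covered by a window of `size` starting at `base`."""
    return [(base + i) % seq_space for i in range(size)]


def window_is_safe(size, seq_space):
    """The slide's rule: window size <= half the sequence number space."""
    return 2 * size <= seq_space


# Slide example: seq #'s 0..3, window size 3.
before = window(0, 3, 4)   # receiver window before pkt0..pkt2 arrive
after = window(3, 3, 4)    # receiver window after delivering pkt0..pkt2
ambiguous = sorted(set(before) & set(after))
print(before, after, ambiguous)
```

With window size 3 the old and new windows share seq #s 0 and 1, which is why a retransmitted pkt0 looks like new data. With window size 2 the two windows are disjoint.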
Observations
v With sliding windows, it is possible to fully utilize
a link (or path), provided the window size is large
enough. Throughput is ~ (n/RTT)
§ Stop & Wait is like n = 1.
v Sender has to buffer all unacknowledged packets,
because they may require retransmission
v Receiver may be able to accept out-of-order
packets, but only up to its buffer limits
v Implementation complexity depends on protocol
details (GBN vs. SR)

Recap: components of a solution
v Checksums (for error detection)
v Timers (for loss detection)
v Acknowledgments
§ cumulative
§ selective
v Sequence numbers (duplicates, windows)
v Sliding Windows (for efficiency)

v Reliability protocols use the above to decide when and what to retransmit or acknowledge

Quiz: GBN vs. SR

v Which of the following is not true?

A. GBN uses cumulative ACKs, SR uses individual ACKs
B. Both GBN and SR use timeouts to address
packet loss
C. GBN maintains a separate timer for each
outstanding packet
D. SR maintains a separate timer for each
outstanding packet
E. Neither GBN nor SR uses NACKs



Practical Reliability Questions
v How do the sender and receiver keep track of
outstanding pipelined segments?
v How many segments should be pipelined?
v How do we choose sequence numbers?
v What does connection establishment and teardown
look like?
v How should we choose timeout values?



TCP: Overview RFCs: 793,1122,1323, 2018, 2581

v point-to-point:
§ one sender, one receiver
v reliable, in-order byte stream:
§ no “message boundaries”
v pipelined:
§ TCP congestion and flow control set window size
v send and receive buffers
v full duplex data:
§ bi-directional data flow in same connection
§ MSS: maximum segment size
v connection-oriented:
§ handshaking (exchange of control msgs) inits sender, receiver state before data exchange
v flow controlled:
§ sender will not overwhelm receiver

[Figure: application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, where the other application reads data through its socket door]

TCP segment structure
header rows, each 32 bits wide:
v source port #, dest port #
v sequence number (counting by bytes of data, not segments!)
v acknowledgement number
v head len, not used, flag bits U A P R S F, receive window (# bytes rcvr willing to accept)
v checksum, Urg data pointer
v options (variable length)
v application data (variable length)

field notes:
§ URG: urgent data (generally not used)
§ ACK: ACK # valid
§ PSH: push data now (generally not used)
§ RST, SYN, FIN: connection estab (setup, teardown commands)
§ checksum: Internet checksum (as in UDP)



TCP Segments
v fields before the options (source port #, dest port #, sequence number, acknowledgement number, head len, flags, receive window, checksum, Urg data pointer) take 20 Bytes (UDP was 8)
v options (variable length) and application data (variable length) follow





Recall: Components of a solution for
reliable transport
v Checksums (for error detection)
v Timers (for loss detection)
v Acknowledgments
§ cumulative
§ selective
v Sequence numbers (duplicates, windows)
v Sliding Windows (for efficiency)
§ Go-Back-N (GBN)
§ Selective Repeat (SR)

What does TCP do?

Many of our previous ideas, but some key differences
v Checksum



TCP Header

v Source port, Destination port
v Sequence number
v Acknowledgment
v HdrLen, 0, Flags, Advertised window
v Checksum (computed over header and data), Urgent pointer
v Options (variable)
v Data



What does TCP do?

Many of our previous ideas, but some key differences
v Checksum
v Sequence numbers are byte offsets



TCP “Stream of Bytes” Service ..
[Figure: the application at Host A writes Byte 0, Byte 1, Byte 2, Byte 3, …, Byte 80; the application at Host B reads the same bytes in the same order]

.. Provided Using TCP “Segments”
[Figure: Host A's bytes (Byte 0 … Byte 80) travel to Host B as TCP Data inside segments]
Segment sent when:
1. Segment full (Max Segment Size),
2. Not full, but times out



TCP Segment
[Figure: an IP packet is IP Hdr + IP Data; the IP Data holds TCP Hdr + TCP Data (segment)]

v IP packet
§ No bigger than Maximum Transmission Unit (MTU)
§ E.g., up to 1500 bytes with Ethernet
v TCP packet
§ IP packet with a TCP header and data inside
§ TCP header ≥ 20 bytes long
v TCP segment
§ No more than Maximum Segment Size (MSS) bytes
§ E.g., up to 1460 consecutive bytes from the stream
§ MSS = MTU – (IP header) – (TCP header)
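The MSS formula above works out as follows for the Ethernet example. The function name `mss` is hypothetical, and the 20-byte defaults assume an IP header and a TCP header without options.

```python
import math


def mss(mtu, ip_header=20, tcp_header=20):
    """MSS = MTU - (IP header) - (TCP header), all in bytes."""
    return mtu - ip_header - tcp_header


ethernet_mss = mss(1500)
# Number of full-or-partial segments needed for a 4000-byte write.
segments_needed = math.ceil(4000 / ethernet_mss)
print(ethernet_mss, segments_needed)
```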
Sequence Numbers
v ISN (initial sequence number) marks the start of Host A's byte stream
v Sequence number = 1st byte in segment = ISN + k, for a segment that starts k bytes into the stream



Sequence Numbers
v Host A to Host B (TCP HDR + TCP Data): Sequence number = 1st byte in segment = ISN + k
v Host B to Host A (TCP HDR + TCP Data): ACK sequence number = next expected byte = seqno + length(data)



What does TCP do?

Most of our previous tricks, but a few differences


v Checksum
v Sequence numbers are byte offsets
v Receiver sends cumulative acknowledgements (like GBN)



ACKing and Sequence Numbers
v Sender sends packet
§ Data starts with sequence number X
§ Packet contains B bytes [X, X+1, X+2, ….X+B-1]

v Upon receipt of packet, receiver sends an ACK


§ If all data prior to X already received:
• ACK acknowledges X+B (because that is next expected byte)
§ If highest in-order byte received is Y s.t. (Y+1) < X
• ACK acknowledges Y+1
• Even if this has been ACKed before



Normal Pattern
v Sender: seqno=X, length=B
v Receiver: ACK=X+B
v Sender: seqno=X+B, length=B
v Receiver: ACK=X+2B
v Sender: seqno=X+2B, length=B

v Seqno of next packet is same as last ACK field



Packet Loss
v Sender: seqno=X, length=B
v Receiver: ACK=X+B
v Sender: seqno=X+B, length=B LOST

v Sender: seqno=X+2B, length=B


v Receiver: ACK = X+B



TCP Header
v Source port, Destination port
v Sequence number
v Acknowledgment: gives seqno just beyond highest seqno received in order (“What Byte is Next”)
v HdrLen, 0, Flags, Advertised window
v Checksum, Urgent pointer
v Options (variable)
v Data



TCP seq. numbers, ACKs
sequence numbers:
§ byte stream “number” of first byte in segment’s data
acknowledgements:
§ seq # of next byte expected from other side
§ cumulative ACK

[Figure: outgoing segment from sender and incoming segment to sender, each showing source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum and urg pointer; the incoming segment has the A (ACK) bit marked]
[Figure: sender sequence number space with window size N: sent ACKed | sent, not-yet ACKed (“in-flight”) | usable but not yet sent | not usable]



Piggybacking
v So far, we’ve assumed distinct “sender” and “receiver” roles
v In reality, usually both sides of a connection send some data
§ request/response is a common pattern

[Figure: Client/Server exchanges Without Piggybacking and With Piggybacking]



Quiz
1. first segment: Seq = 101, 2 KBytes of data
2. reply: Seq = 1024, 1 KByte of data, ACK = ?
§ ACK = 101 + 2048 = 2149
3. next segment: Seq = ?, 2 KBytes of data, ACK = ?
§ Seq = 2149
§ ACK = 1024 + 1024 = 2048

What does TCP do?

Most of our previous tricks, but a few differences


v Checksum
v Sequence numbers are byte offsets
v Receiver sends cumulative acknowledgements (like GBN)
v Receivers can buffer out-of-sequence packets (like SR)



Loss with cumulative ACKs

v Sender sends packets with 100B and seqnos.:
§ 100, 200, 300, 400, 500, 600, 700, 800, 900, …
v Assume the fifth packet (seqno 500) is lost, but no others
v Stream of ACKs will be:
§ 200, 300, 400, 500, 500, 500, 500,…
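The ACK stream above can be reproduced with a short simulation of a receiver that buffers out-of-order data and always ACKs the next expected byte. The function name `ack_stream` is hypothetical, and the sketch assumes segments never partially overlap.

```python
def ack_stream(arrivals, first_expected):
    """arrivals: (seqno, length) pairs in arrival order; returns the ACKs sent."""
    next_expected = first_expected
    buffered = {}                       # seqno -> length of out-of-order data
    acks = []
    for seqno, length in arrivals:
        if seqno >= next_expected:
            buffered[seqno] = length
        # Advance over every contiguous segment now available.
        while next_expected in buffered:
            next_expected += buffered.pop(next_expected)
        acks.append(next_expected)      # cumulative ACK: next byte wanted
    return acks


sent = [(seqno, 100) for seqno in range(100, 1000, 100)]
arrived = [pkt for pkt in sent if pkt[0] != 500]      # seqno 500 is lost
acks = ack_stream(arrived, first_expected=100)
acks_after_resend = ack_stream(arrived + [(500, 100)], first_expected=100)
print(acks)
print(acks_after_resend[-1])
```

Once the missing segment is retransmitted, the ACK jumps straight to 1000 because the later segments were buffered rather than dropped.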



What does TCP do?

Most of our previous tricks, but a few differences


v Checksum
v Sequence numbers are byte offsets
v Receiver sends cumulative acknowledgements (like GBN)
v Receivers do not drop out-of-sequence packets (like SR)
v Sender maintains a single retransmission timer (like GBN) and
retransmits on timeout



TCP round trip time, timeout
Q: how to set TCP timeout value?
v longer than RTT
§ but RTT varies
v too short: premature timeout, unnecessary retransmissions
v too long: slow reaction to segment loss and connection has lower throughput

Q: how to estimate RTT?
v SampleRTT: measured time from segment transmission until ACK receipt
§ ignore retransmissions
v SampleRTT will vary, want estimated RTT “smoother”
§ average several recent measurements, not just current SampleRTT



TCP round trip time, timeout
EstimatedRTT = (1- α)*EstimatedRTT + α*SampleRTT
v exponential weighted moving average
v influence of past sample decreases exponentially fast
v typical value: α = 0.125

[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds); curves: SampleRTT and EstimatedRTT]

TCP round trip time, timeout
v timeout interval: EstimatedRTT plus “safety margin”
§ large variation in EstimatedRTT -> larger safety margin
v estimate SampleRTT deviation from EstimatedRTT:
DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)

TimeoutInterval = EstimatedRTT + 4*DevRTT
(EstimatedRTT is the estimated RTT; 4*DevRTT is the “safety margin”)
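The three formulas can be combined into a small estimator. This is a sketch under assumptions the slides do not state: the class name `RttEstimator` is made up, the first sample initializes EstimatedRTT to the sample and DevRTT to half of it, and DevRTT is updated before EstimatedRTT (both choices follow RFC 6298).

```python
class RttEstimator:
    """EWMA-based RTT and timeout estimator (times in milliseconds)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumed initialization, as in RFC 6298.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        """Feed one SampleRTT (never from a retransmitted segment)."""
        # DevRTT uses the EstimatedRTT value from before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100)
est.update(120)
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval)
```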

Practice Problem:
http://wps.pearsoned.com/ecs_kurose_compnetw_6/216/55463/14198700.cw/index.html

Why exclude retransmissions in RTT computation?
v How do we differentiate between the real ACK, and ACK of the retransmitted packet?

[Figure: two Sender/Receiver timelines, each with an Original Transmission, a Retransmission and a single ACK; the SampleRTT interval differs depending on which transmission the ACK is matched to]



TCP sender events: PUTTING IT TOGETHER

data rcvd from app:
v create segment with seq #
v seq # is byte-stream number of first data byte in segment
v start timer if not already running
§ think of timer as for oldest unacked segment
§ expiration interval: TimeOutInterval

timeout:
v retransmit segment that caused timeout
v restart timer

ack rcvd:
v if ack acknowledges previously unacked segments
§ update what is known to be ACKed
§ start timer if there are still unacked segments



TCP sender (simplified) PUTTING IT TOGETHER

initial state (Λ):
    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

state: wait for event

data received from application above:
    create segment, seq. #: NextSeqNum
    pass segment to IP (i.e., “send”)
    NextSeqNum = NextSeqNum + length(data)
    if (timer currently not running)
        start timer

timeout:
    retransmit not-yet-acked segment with smallest seq. #
    start timer

ACK received, with ACK field value y:
    if (y > SendBase) {
        SendBase = y
        /* SendBase–1: last cumulatively ACKed byte */
        if (there are currently not-yet-acked segments)
            start timer
        else stop timer
    }
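The three sender events can be sketched in Python as below. The class name `SimpleTcpSender`, the `sent` log that stands in for passing segments to IP, and the boolean `timer_running` flag are assumptions made for illustration; sequence-number wraparound and fast retransmit are left out.

```python
class SimpleTcpSender:
    """Toy version of the simplified TCP sender (no wraparound handling)."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}               # seq # -> payload awaiting an ACK
        self.timer_running = False
        self.sent = []                  # log of (seq #, length) passed to IP

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        self.sent.append((seq, len(data)))
        self.next_seq_num += len(data)
        if not self.timer_running:
            self.timer_running = True

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)     # not-yet-acked segment, smallest seq #
            self.sent.append((seq, len(self.unacked[seq])))
        self.timer_running = bool(self.unacked)

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y          # bytes up to y-1 are cumulatively ACKed
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


tcp = SimpleTcpSender(initial_seq_num=92)
tcp.on_app_data(b"x" * 8)               # Seq=92, 8 bytes of data
tcp.on_app_data(b"y" * 20)              # Seq=100, 20 bytes of data
tcp.on_timeout()                        # resends only Seq=92
tcp.on_ack(120)                         # cumulative ACK covers both segments
print(tcp.sent, tcp.send_base, tcp.timer_running)
```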
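TCP: retransmission scenarios

lost ACK scenario:
§ Host A sends Seq=92, 8 bytes of data
§ Host B replies ACK=100, which is lost (X)
§ Host A times out and resends Seq=92, 8 bytes of data
§ Host B replies ACK=100 again; SendBase=100

premature timeout (SendBase=92 at the start):
§ Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
§ Host B replies ACK=100 and ACK=120, but the timer expires before they reach Host A
§ Host A resends Seq=92, 8 bytes of data
§ the delayed ACKs arrive: SendBase=100, then SendBase=120
§ Host B answers the duplicate segment with ACK=120; SendBase=120
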
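TCP: retransmission scenarios

cumulative ACK:
§ Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
§ Host B replies ACK=100, which is lost (X), then ACK=120
§ ACK=120 arrives before the timeout, so nothing is retransmitted
§ Host A continues with Seq=120, 15 bytes of data
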
TCP ACK generation [RFC 1122, RFC 2581]

v event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
§ TCP receiver action: delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

v event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending
§ TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

v event at receiver: arrival of out-of-order segment with higher-than-expected seq. #. Gap detected
§ TCP receiver action: immediately send duplicate ACK, indicating seq. # of next expected byte

v event at receiver: arrival of segment that partially or completely fills gap
§ TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap



What does TCP do?

Most of our previous tricks, but a few differences


v Checksum
v Sequence numbers are byte offsets
v Receiver sends cumulative acknowledgements (like GBN)
v Receivers do not drop out-of-sequence packets (like SR)
v Sender maintains a single retransmission timer (like GBN) and
retransmits on timeout
v Introduces fast retransmit: optimisation that uses duplicate
ACKs to trigger early retransmission

TCP fast retransmit
v time-out period often relatively long:
§ long delay before resending lost packet
v “Duplicate ACKs” are a sign of an isolated loss
§ The lack of ACK progress means that packet hasn’t been delivered
§ Stream of ACKs means some packets are being delivered
§ Could trigger resend on receiving “k” duplicate ACKs (TCP uses k = 3)

TCP fast retransmit
if sender receives 3 duplicate ACKs for same data (“triple duplicate ACKs”), resend unacked segment with smallest seq #
§ likely that unacked segment is lost, so don’t wait for timeout
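Counting duplicate ACKs takes very little state, as this sketch shows. The class name `DupAckDetector` is hypothetical; it only reports when a fast retransmit should fire and leaves the actual resend to the caller.

```python
class DupAckDetector:
    """Signals fast retransmit on the k-th duplicate ACK (k = 3 by default)."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, acknum):
        """Return True exactly when the threshold-th duplicate arrives."""
        if acknum == self.last_ack:
            self.dup_count += 1
            return self.dup_count == self.threshold
        # New ACK value: progress was made, so start counting afresh.
        self.last_ack = acknum
        self.dup_count = 0
        return False


detector = DupAckDetector()
# One original ACK=100 followed by three duplicates, then ACK progress.
signals = [detector.on_ack(a) for a in (100, 100, 100, 100, 100, 120)]
print(signals)
```

Only the fourth ACK=100 (the third duplicate) triggers the resend; a further duplicate does not trigger it again.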
TCP fast retransmit
[Figure: fast retransmit after sender receipt of triple duplicate ACK]
§ Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data, which is lost (X)
§ Host B sends ACK=100 four times
§ Host A resends Seq=100, 20 bytes of data before the timeout expires

What does TCP do?

Most of our previous ideas, but some key


differences
v Checksum
v Sequence numbers are byte offsets
v Receiver sends cumulative acknowledgements (like GBN)
v Receivers do not drop out-of-sequence packets (like SR)
v Sender maintains a single retransmission timer (like GBN) and
retransmits on timeout
v Introduces fast retransmit: optimization that uses duplicate
ACKs to trigger early retransmission





TCP flow control
v application may remove data from TCP socket buffers …
§ … slower than TCP receiver is delivering (sender is sending)

flow control: receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast

[Figure: receiver protocol stack: segments from sender pass up through IP code and TCP code into the TCP socket receiver buffers in the OS, and from there to the application process]



TCP flow control
v receiver “advertises” free buffer space by including rwnd value in TCP header of receiver-to-sender segments
§ RcvBuffer size set via socket options (typical default is 4096 bytes)
§ many operating systems autoadjust RcvBuffer
v sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value
v guarantees receive buffer will not overflow

[Figure: receiver-side buffering: TCP segment payloads arrive into RcvBuffer, which holds buffered data on its way to the application process plus rwnd bytes of free buffer space]
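The bookkeeping behind rwnd can be sketched as follows. The names `ReceiveBuffer` and `may_send` are hypothetical, and the model tracks byte counts only, not the data itself.

```python
class ReceiveBuffer:
    """Receiver-side buffer that advertises its free space as rwnd."""

    def __init__(self, rcv_buffer=4096):
        self.rcv_buffer = rcv_buffer
        self.buffered = 0               # bytes received but not yet read

    @property
    def rwnd(self):
        return self.rcv_buffer - self.buffered

    def on_segment(self, nbytes):
        if nbytes > self.rwnd:
            raise OverflowError("sender ignored rwnd")
        self.buffered += nbytes

    def app_read(self, nbytes):
        nbytes = min(nbytes, self.buffered)
        self.buffered -= nbytes
        return nbytes


def may_send(in_flight, nbytes, rwnd):
    """Sender-side check: keep unacked ("in-flight") data within rwnd."""
    return in_flight + nbytes <= rwnd


buf = ReceiveBuffer(rcv_buffer=4096)
buf.on_segment(1000)
rwnd_after_segment = buf.rwnd           # 4096 - 1000
buf.app_read(400)
rwnd_after_read = buf.rwnd              # 4096 - 600
print(rwnd_after_segment, rwnd_after_read)
```

Because the sender never has more than rwnd bytes in flight, `on_segment` should never hit its overflow branch when both sides follow the rule.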
http://media.pearsoncmg.com/aw/aw_kurose_network_4/applets/flow/FlowControl.htm


Connection Management
before exchanging data, sender/receiver “handshake”:
v agree to establish connection (each knowing the other willing to establish connection)
v agree on connection parameters

on each side (application above, network below):
§ connection state: ESTAB
§ connection variables: seq # client-to-server and server-to-client; rcvBuffer size at server, client

client:
Socket clientSocket = new Socket("hostname","port number");
server:
Socket connectionSocket = welcomeSocket.accept();



Initial Sequence Number (ISN)
v Sequence number for the very first byte

v Why not just use ISN = 0?

v Practical issue
§ IP addresses and port #s uniquely identify a connection
§ Eventually, though, these port #s do get used again
§ … small chance an old packet is still in flight
§ Easy to hijack a TCP connection (security threat)

v TCP therefore requires changing ISN

v Hosts exchange ISNs when they establish a connection

Agreeing to establish a connection

2-way handshake:
§ one side says “Let’s talk”, the other answers “OK”, and both enter ESTAB
§ as messages: client chooses x and sends req_conn(x); server enters ESTAB and replies acc_conn(x); client enters ESTAB

Q: will 2-way handshake always work in network?
v variable delays
v retransmitted messages (e.g. req_conn(x)) due to message loss
v message reordering
v can’t “see” other side



Agreeing to establish a connection
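2-way handshake failure scenarios:

first scenario: half open connection! (no client!)
§ client chooses x and sends req_conn(x); server enters ESTAB and replies acc_conn(x)
§ client retransmits req_conn(x), then enters ESTAB when acc_conn(x) arrives
§ connection x completes; client terminates; server forgets x
§ the retransmitted req_conn(x) arrives afterwards: server enters ESTAB again, with no client

second scenario: old data accepted
§ same start, but the client also sends data(x+1) and retransmits it; server accepts data(x+1)
§ connection x completes; client terminates; server forgets x
§ the retransmitted req_conn(x) and data(x+1) arrive afterwards: server enters ESTAB and accepts data(x+1) again
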
TCP 3-way handshake
client (A) state: CLOSED; server (B) state: LISTEN

1. client: choose init seq num, x; send TCP SYN msg (SYNbit=1, Seq=x); client state: SYNSENT
2. server: choose init seq num, y; send TCP SYNACK msg, acking SYN (SYNbit=1, Seq=y, ACKbit=1; ACKnum=x+1); server state: SYN RCVD
3. client: received SYNACK(x) indicates server is live; send ACK for SYNACK (ACKbit=1, ACKnum=y+1); this segment may contain client-to-server data; client state: ESTAB
4. server: received ACK(y) indicates client is live; server state: ESTAB
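The numbers exchanged in the handshake can be written out as plain data. The function name `three_way_handshake` and the dictionary layout are illustrative only; the modulo reflects that TCP sequence numbers are 32-bit values that wrap around.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(x, y):
    """Return the SYN, SYNACK and ACK segments for client ISN x, server ISN y."""
    syn = {"SYNbit": 1, "ACKbit": 0, "Seq": x}
    synack = {"SYNbit": 1, "ACKbit": 1, "Seq": y,
              "ACKnum": (x + 1) % SEQ_SPACE}
    ack = {"SYNbit": 0, "ACKbit": 1, "ACKnum": (y + 1) % SEQ_SPACE}
    return syn, synack, ack


syn, synack, ack = three_way_handshake(x=1000, y=5000)
print(syn)
print(synack)
print(ack)
```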



TCP 3-way handshake: FSM

server side:
§ closed -> listen: Λ / Socket connectionSocket = welcomeSocket.accept();
§ listen -> SYN rcvd: SYN(x) / SYNACK(seq=y,ACKnum=x+1), create new socket for communication back to client
§ SYN rcvd -> ESTAB: ACK(ACKnum=y+1) / Λ

client side:
§ closed -> SYN sent: Socket clientSocket = new Socket("hostname","port number"); / SYN(seq=x)
§ SYN sent -> ESTAB: SYNACK(seq=y,ACKnum=x+1) / ACK(ACKnum=y+1)



Step 1: A’s Initial SYN Packet

v Source port: A’s port; Destination port: B’s port
v Sequence number: A’s Initial Sequence Number
v Acknowledgment: (Irrelevant since ACK not set)
v HdrLen: 5; 0; Flags (SYN, ACK, FIN, RST, PSH, URG): SYN set; Advertised window
v Checksum, Urgent pointer
v Options (variable)

A tells B it wants to open a connection…



Step 2: B’s SYN-ACK Packet

v Source port: B’s port; Destination port: A’s port
v Sequence number: B’s Initial Sequence Number
v Acknowledgment: ACK = A’s ISN plus 1
v HdrLen: 5; 0; Flags (SYN, ACK, FIN, RST, PSH, URG): SYN and ACK set; Advertised window
v Checksum, Urgent pointer
v Options (variable)

B tells A it accepts, and is ready to hear the next byte…
… upon receiving this packet, A can start sending data

Step 3: A’s ACK of the SYN-ACK

v Source port: A’s port; Destination port: B’s port
v Sequence number: A’s Initial Sequence Number+1
v Acknowledgment: B’s ISN plus 1
v HdrLen: 5; 0; Flags (SYN, ACK, FIN, RST, PSH, URG): ACK set; Advertised window
v Checksum, Urgent pointer
v Options (variable)

A tells B it’s likewise okay to start sending
… upon receiving this packet, B can start sending data

What if the SYN Packet Gets Lost?

v Suppose the SYN packet gets lost


§ Packet is lost inside the network, or:
§ Server discards the packet (e.g., it’s too busy)

v Eventually, no SYN-ACK arrives


§ Sender sets a timer and waits for the SYN-ACK
§ … and retransmits the SYN if needed

v How should the TCP sender set the timer?


§ Sender has no idea how far away the receiver is
§ Hard to guess a reasonable length of time to wait
§ SHOULD (RFCs 1122 & 2988) use default of 3 seconds
• Some implementations instead use 6 seconds



SYN Loss and Web Downloads
v User clicks on a hypertext link
§ Browser creates a socket and does a “connect”
§ The “connect” triggers the OS to transmit a SYN
v If the SYN is lost…
§ 3-6 seconds of delay: can be very long
§ User may become impatient
§ … and click the hyperlink again, or click “reload”
v User triggers an “abort” of the “connect”
§ Browser creates a new socket and another “connect”
§ Essentially, forces a faster send of a new SYN packet!
§ Sometimes very effective, and the page comes quickly



TCP: closing a connection
v client, server each close their side of connection
§ send TCP segment with FIN bit = 1
v respond to received FIN with ACK
§ on receiving FIN, ACK can be combined with own FIN
v simultaneous FIN exchanges can be handled



Normal Termination, One at a Time
client state: ESTAB; server state: ESTAB

1. client: clientSocket.close(); sends FINbit=1, seq=x; client state: FIN_WAIT_1 (can no longer send but can receive data)
2. server: replies ACKbit=1; ACKnum=x+1; server state: CLOSE_WAIT (can still send data)
3. client: state FIN_WAIT_2 (wait for server close)
4. server: sends FINbit=1, seq=y; server state: LAST_ACK (can no longer send data)
5. client: replies ACKbit=1; ACKnum=y+1; client state: TIMED_WAIT (timed wait for 2*max segment lifetime), then CLOSED
6. server: state CLOSED once the ACK arrives

TIMED_WAIT: Can retransmit ACK if ACK is lost

Normal Termination, Both Together
client state: ESTAB; server state: ESTAB

1. client: clientSocket.close(); sends FINbit=1, seq=x; client state: FIN_WAIT_1 (can no longer send but can receive data; wait for server close)
2. server: sends FIN + ACK together (ACKbit=1; ACKnum=x+1 and FINbit=1, seq=y); server state: CLOSE_WAIT, then LAST_ACK (can no longer send data)
3. client: replies ACKbit=1; ACKnum=y+1; client state: TIMED_WAIT (timed wait for 2*max segment lifetime), then CLOSED
4. server: state CLOSED once the ACK arrives



Abrupt Termination
[Figure: time line of segments exchanged between A and B: SYN, SYN ACK, ACK, Data, ACK, Data, RST, RST]

v A sends a RESET (RST) to B


§ E.g., because application process on A crashed
v That’s it
§ B does not ack the RST
§ Thus, RST is not delivered reliably
§ And: any data in flight is lost
§ But: if B sends anything more, will elicit another RST
TCP Finite State Machine
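[Figure: TCP state-transition diagram; each transition is labelled event/action]
§ CLOSED -> LISTEN: Passive open
§ CLOSED -> SYN_SENT: Active open/SYN
§ LISTEN -> CLOSED: Close
§ SYN_SENT -> CLOSED: Close
§ LISTEN -> SYN_RCVD: SYN/SYN + ACK
§ LISTEN -> SYN_SENT: Send/SYN
§ SYN_SENT -> SYN_RCVD: SYN/SYN + ACK
§ SYN_RCVD -> ESTABLISHED: ACK
§ SYN_SENT -> ESTABLISHED: SYN + ACK/ACK
§ ESTABLISHED: Data, ACK exchanges are in here
§ SYN_RCVD -> FIN_WAIT_1: Close/FIN
§ ESTABLISHED -> FIN_WAIT_1: Close/FIN
§ ESTABLISHED -> CLOSE_WAIT: FIN/ACK
§ CLOSE_WAIT -> LAST_ACK: Close/FIN
§ LAST_ACK -> CLOSED: ACK
§ FIN_WAIT_1 -> FIN_WAIT_2: ACK
§ FIN_WAIT_1 -> CLOSING: FIN/ACK
§ FIN_WAIT_1 -> TIME_WAIT: ACK + FIN/ACK
§ FIN_WAIT_2 -> TIME_WAIT: FIN/ACK
§ CLOSING -> TIME_WAIT: ACK
§ TIME_WAIT -> CLOSED: Timeout after two segment lifetimes
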


TCP Connection Management (cont)

[Figures: TCP server lifecycle; TCP client lifecycle]



TCP SYN Attack (SYN flooding)
v Miscreant creates a fake SYN packet
§ Destination is IP address of victim host (usually some server)
§ Source is some spoofed IP address
v On receiving the SYN, the victim host creates TCP connection state (i.e., allocates buffers, creates variables, etc.) and sends a SYN ACK to the spoofed address (half-open connection)
v ACK never comes back
v After a timeout, the connection state is freed
v However, for this duration the connection state is held unnecessarily
v Further, the miscreant sends a large number of fake SYNs
§ Can easily overwhelm the victim
v Solutions:
§ Increase size of connection queue
§ Decrease timeout wait for the 3-way handshake
§ Firewalls: list of known bad source IP addresses
§ TCP SYN Cookies (explained on next slide)



TCP SYN Cookie
v On receipt of SYN, server does not create connection
state
v It creates an initial sequence number (init_seq) that is a
hash of source & dest IP address and port number of SYN
packet (secret key used for hash)
§ Replies back with SYN ACK containing init_seq
§ Server does not need to store this sequence number
v If original SYN is genuine, an ACK will come back
§ Same hash function run on the same header fields to get the initial
sequence number (init_seq)
§ Checks if the ACK is equal to (init_seq+1)
§ Only create connection state if above is true
v If fake SYN, no harm done since no state was created
http://etherealmind.com/tcp-syn-cookies-ddos-defence/
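The stateless check described on this slide can be sketched with a keyed hash. Everything named here is an assumption for illustration: the helper names `syn_cookie` and `ack_is_valid`, the choice of HMAC-SHA256 truncated to 32 bits, and the way the four header fields are joined. Real SYN cookie implementations also fold in a time counter and an encoded MSS, which this sketch omits.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(16)     # server-side secret key, never sent on the wire
SEQ_SPACE = 2 ** 32         # TCP sequence numbers are 32-bit


def syn_cookie(src_ip, src_port, dst_ip, dst_port, secret=SECRET):
    """Derive init_seq from the SYN's addresses and ports; nothing is stored."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(acknum, src_ip, src_port, dst_ip, dst_port, secret=SECRET):
    """Recompute init_seq from the ACK's header fields and compare to ACK - 1."""
    expected = (syn_cookie(src_ip, src_port, dst_ip, dst_port, secret) + 1) % SEQ_SPACE
    return acknum == expected


flow = ("10.0.0.1", 5555, "10.0.0.2", 80)       # made-up example addresses
init_seq = syn_cookie(*flow)                    # sent in the SYN ACK
genuine = ack_is_valid((init_seq + 1) % SEQ_SPACE, *flow)
forged = ack_is_valid((init_seq + 2) % SEQ_SPACE, *flow)
print(genuine, forged)
```

Connection state would be created only when `ack_is_valid` returns True, so a flood of spoofed SYNs costs the server one hash each and no memory.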
Taking Stock (1)
v The concepts underlying TCP are simple
§ acknowledgments (feedback)
§ timers
§ sliding windows
§ buffer management
§ sequence numbers



Taking Stock (2)
v The concepts underlying TCP are simple
v But tricky in the details
§ How do we set timers?
§ What is the seqno for an ACK-only packet?
§ What happens if advertised window = 0?
§ What if the advertised window is ½ an MSS?
§ Should receiver acknowledge packets right away?
§ What if the application generates data in units of 0.1
MSS?
§ What happens if I get a duplicate SYN? Or a RST while I’m in FIN_WAIT, etc., etc., etc.

Transport: summary (so far)
v principles behind transport layer services:
§ multiplexing, demultiplexing
§ reliable data transfer
§ flow control
§ congestion control (next week)
v instantiation, implementation in the Internet
§ UDP
§ TCP

next:
v leaving the network “edge” (application, transport layers)
v into the network “core”