Ilovepdf Merged-Compressed

Uploaded by

api-557267236
Chapter 1

Introduction
Computer Networking: A Top-Down Approach
7th edition
Jim Kurose, Keith Ross
Pearson/Addison Wesley
April 2016

A note on the use of these PowerPoint slides:
We’re making these slides freely available to all (faculty, students, readers).
They’re in PowerPoint form so you see the animations; and can add, modify,
and delete slides (including this one) and slide content to suit your needs.
They obviously represent a lot of work on our part. In return for use, we only
ask the following:
▪ If you use these slides (e.g., in a class) that you mention their source
(after all, we’d like people to use our book!)
▪ If you post any slides on a www site, that you note that they are adapted
from (or perhaps identical to) our slides, and note our copyright of this
material.
Thanks and enjoy! JFK/KWR
All material copyright 1996-2016
J.F Kurose and K.W. Ross, All Rights Reserved

© Department of Networked Systems and Services 1


Chapter 1: introduction
our goal:
• get “feel” and terminology
• more depth, detail later in course
• approach:
– use Internet as example
overview:
• what’s the Internet?
• what’s a protocol?
• network edge: hosts, access net, physical media
• network core: packet/circuit switching, Internet structure
• performance: loss, delay, throughput
• security
• protocol layers, service models
• history
Chapter 1: roadmap
1.1 what is the Internet?
1.2 network edge
• end systems, access networks, links
1.3 network core
• packet switching, circuit switching, network structure
1.4 delay, loss, throughput in networks
1.5 protocol layers, service models
1.6 networks under attack: security
1.7 history



What’s the Internet: “nuts and bolts” view
• billions of connected computing devices:
– hosts = end systems
– running network apps
▪ communication links
• fiber, copper, radio, satellite
• transmission rate: bandwidth
▪ packet switches: forward packets (chunks of data)
• routers and switches
[Figure: PCs, servers, wireless laptops, and smartphones attached by wired and wireless links to routers in mobile, home, and institutional networks, which connect through regional and global ISPs]
“Fun” Internet-connected devices
• IP picture frame: http://www.ceiva.com/
• Web-enabled toaster + weather forecaster
• Tweet-a-watt: monitor energy use
• Slingbox: watch, control cable TV remotely
• sensorized bed mattress
• Internet refrigerator
• Internet phones


What’s the Internet: “nuts and bolts” view
• Internet: “network of networks”
– Interconnected ISPs
• protocols control sending, receiving of messages
– e.g., TCP, IP, HTTP, Skype, 802.11
• Internet standards
– RFC: Request for comments
– IETF: Internet Engineering Task Force


What’s the Internet: a service view
• infrastructure that provides services to applications:
– Web, VoIP, email, games, e-commerce, social nets, …
• provides programming interface to apps
– hooks that allow sending and receiving app programs to “connect” to Internet
– provides service options, analogous to postal service


What’s a protocol?
human protocols:
• “what’s the time?”
• “I have a question”
• introductions
… specific messages sent
… specific actions taken when messages received, or other events
network protocols:
▪ machines rather than humans
▪ all communication activity in Internet governed by protocols
protocols define format, order of messages sent and received among network entities, and actions taken on message transmission, receipt
What’s a protocol?
a human protocol and a computer network protocol:
human protocol: Hi → Hi → Got the time? → 2:00
network protocol: TCP connection request → TCP connection response → Get http://www.awl.com/kurose-ross → <file>
[Figure: both exchanges drawn as message arrows along a time axis]
Q: other human protocols?
Chapter 1: roadmap

1.1 what is the Internet?


1.2 network edge
• end systems, access networks, links
1.3 network core
• packet switching, circuit switching, network structure
1.4 delay, loss, throughput in networks
1.5 protocol layers, service models
1.6 networks under attack: security
1.7 history



A closer look at network structure:
▪ network edge:
• hosts: clients and servers
• servers often in data centers
▪ access networks, physical media: wired, wireless communication links
▪ network core:
• interconnected routers
• network of networks


Access networks and
physical media
Q: How to connect
end systems to
edge router?
▪ residential access nets
▪ institutional access
networks (school, company)
▪ mobile access networks

keep in mind:
▪ bandwidth (bits per second)
of access network?
▪ shared or dedicated?



Access network: digital subscriber line (DSL)
[Figure: DSL modem and splitter at the home connect over a dedicated line to the DSLAM (DSL access multiplexer) in the central office, which connects to the ISP and to the telephone network; voice, data transmitted at different frequencies over dedicated line to central office]
▪ use existing telephone line to central office DSLAM
• data over DSL phone line goes to Internet
• voice over DSL phone line goes to telephone net
▪ < 2.5 Mbps upstream transmission rate (typically < 1 Mbps)
▪ < 24 Mbps downstream transmission rate (typically < 10 Mbps)
Access network: cable network
[Figure: homes with cable modem and splitter attach to the cable headend; channels 1-6 carry video, channels 7-8 carry data, channel 9 carries control]
frequency division multiplexing: different channels transmitted in different frequency bands


Access network: cable network
[Figure: cable modem and splitter at the home connect over the shared cable distribution network to the CMTS (cable modem termination system) at the cable headend, which connects to the ISP; data, TV transmitted at different frequencies]
▪ HFC: hybrid fiber coax
• asymmetric: up to 30 Mbps downstream transmission rate, 2 Mbps upstream transmission rate
▪ network of cable, fiber attaches homes to ISP router
• homes share access network to cable headend
• unlike DSL, which has dedicated access to central office
Access network: home network
[Figure: wireless devices connect to a wireless access point (54 Mbps) and wired devices to wired Ethernet (1 Gbps); both attach to a router, firewall, NAT, which connects through a cable or DSL modem to/from the headend or central office; access point, router, and modem are often combined in a single box]


Enterprise access networks (Ethernet)
[Figure: end systems attach to an Ethernet switch, which connects to the institutional router, the institutional mail and web servers, and the institutional link to the ISP (Internet)]
▪ typically used in companies, universities, etc.
▪ 10 Mbps, 100 Mbps, 1 Gbps, 10 Gbps transmission rates
▪ today, end systems typically connect into Ethernet switch
Wireless access networks
• shared wireless access network connects end system to router
– via base station aka “access point”
wireless LANs:
▪ within building (100 ft.)
▪ 802.11b/g/n (WiFi): 11, 54, 450 Mbps transmission rate
wide-area wireless access:
▪ provided by telco (cellular) operator, 10’s km
▪ between 1 and 10 Mbps
▪ 3G, 4G: LTE
[Figure: both kinds of access network connect to the Internet]


Host: sends packets of data
host sending function:
▪ takes application message
▪ breaks into smaller chunks, known as packets, of length L bits
▪ transmits packet into access network at transmission rate R
• link transmission rate, aka link capacity, aka link bandwidth
[Figure: host sends two packets, L bits each, into a link with transmission rate R]
packet transmission delay = time needed to transmit L-bit packet into link = L (bits) / R (bits/sec)
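The transmission-delay formula above can be sketched in a few lines of Python. This is an illustration only, not part of the original slides: the function name `transmission_delay` and the example packet size and link rate are assumptions chosen for the demonstration.

```python
def transmission_delay(packet_bits: float, rate_bps: float) -> float:
    """Seconds needed to push a packet of L bits onto a link of R bits/sec."""
    if rate_bps <= 0:
        raise ValueError("link rate must be positive")
    return packet_bits / rate_bps


# Assumed example values: a 12,000-bit packet on a 100 Mbps link.
delay = transmission_delay(12_000, 100e6)
print(f"{delay * 1e6:.0f} microseconds")
```

Note that the result depends only on packet length and link rate, not on how long the link is.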


Physical media
• bit: propagates between transmitter/receiver pairs
• physical link: what lies between transmitter & receiver
• guided media:
– signals propagate in solid media: copper, fiber, coax
• unguided media:
– signals propagate freely, e.g., radio
twisted pair (TP)
▪ two insulated copper wires
• Category 5: 100 Mbps, 1 Gbps Ethernet
• Category 6: 10 Gbps
Physical media: coax, fiber
coaxial cable:
▪ two concentric copper conductors
▪ bidirectional
▪ broadband:
• multiple channels on cable
• HFC
fiber optic cable:
▪ glass fiber carrying light pulses, each pulse a bit
▪ high-speed operation:
• high-speed point-to-point transmission (e.g., 10’s-100’s Gbps transmission rate)
▪ low error rate:
• repeaters spaced far apart
• immune to electromagnetic noise


Physical media: radio
• signal carried in electromagnetic spectrum
• no physical “wire”
• bidirectional
• propagation environment effects:
– reflection
– obstruction by objects
– interference
radio link types:
▪ terrestrial microwave
• e.g. up to 45 Mbps channels
▪ LAN (e.g., WiFi)
• 54 Mbps
▪ wide-area (e.g., cellular)
• 4G cellular: ~ 10 Mbps
▪ satellite
• Kbps to 45 Mbps channel (or multiple smaller channels)
• 270 msec end-end delay
• geosynchronous versus low altitude
Chapter 1: roadmap
1.1 what is the Internet?
1.2 network edge
• end systems, access networks, links
1.3 network core
• packet switching, circuit switching, network structure
1.4 delay, loss, throughput in networks
1.5 protocol layers, service models
1.6 networks under attack: security
1.7 history



The network core

• mesh of interconnected
routers
• packet-switching: hosts
break application-layer
messages into packets
– forward packets from
one router to the next,
across links on path
from source to
destination
– each packet transmitted
at full link capacity



Packet-switching: store-and-forward
[Figure: source sends packets 1, 2, 3 (L bits per packet) over an R bps link to a router, which forwards them over a second R bps link to the destination]
• takes L/R seconds to transmit (push out) L-bit packet into link at R bps
• store and forward: entire packet must arrive at router before it can be transmitted on next link
▪ end-end delay = 2L/R (assuming zero propagation delay)
one-hop numerical example:
▪ L = 7.5 Mbits
▪ R = 1.5 Mbps
▪ one-hop transmission delay = 5 sec
more on delay shortly …
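The store-and-forward arithmetic can be checked with a short Python sketch. The function name and parameters are hypothetical. The multi-packet generalization, (N + P - 1) · L/R for P back-to-back packets over N links, is the standard textbook extension rather than something stated on the slide, and it assumes equal link rates, zero propagation delay, and no competing traffic.

```python
def store_and_forward_delay(packet_bits, rate_bps, num_links, num_packets=1):
    """End-to-end delay in seconds until the last packet reaches the destination.

    Assumes every link has the same rate, zero propagation delay, and that
    each router must receive a whole packet before forwarding it.
    """
    per_hop = packet_bits / rate_bps  # L/R: time to push one packet onto one link
    return (num_links + num_packets - 1) * per_hop


L = 7.5e6  # 7.5 Mbits, from the slide's numerical example
R = 1.5e6  # 1.5 Mbps, from the slide's numerical example

print(store_and_forward_delay(L, R, num_links=1))  # one hop: L/R
print(store_and_forward_delay(L, R, num_links=2))  # source -> router -> destination: 2L/R
print(store_and_forward_delay(L, R, num_links=2, num_packets=3))  # three packets pipelined
```

With the slide's numbers the first two calls give 5 and 10 seconds, matching L/R and 2L/R.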
Packet Switching: queueing delay, loss
[Figure: hosts A through E around a router; traffic arrives on an R = 100 Mb/s link and leaves on an R = 1.5 Mb/s link; queue of packets waiting for output link]
queuing and loss:
▪ if arrival rate (in bits) to link exceeds transmission rate of link for a period of time:
• packets will queue, wait to be transmitted on link
• packets can be dropped (lost) if memory (buffer) fills up


Two key network-core functions
routing: determines source-destination route taken by packets
▪ routing algorithms
forwarding: move packets from router’s input to appropriate router output
[Figure: routing algorithm fills the local forwarding table; destination address in arriving packet’s header selects output link 1, 2, or 3]
local forwarding table:
header value   output link
0100           3
0101           2
0111           2
1001           1
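The forwarding step can be illustrated as a table lookup. The sketch below uses the four table entries from the slide; the names `FORWARDING_TABLE` and `forward` are hypothetical, and the exact-match lookup is a simplifying assumption (real IP routers match on address prefixes rather than whole header values).

```python
# Local forwarding table from the slide: header value -> output link.
FORWARDING_TABLE = {
    "0100": 3,
    "0101": 2,
    "0111": 2,
    "1001": 1,
}


def forward(header_value: str):
    """Return the output link for an arriving packet, or None if no entry matches."""
    return FORWARDING_TABLE.get(header_value)


print(forward("0111"))  # entry exists: output link 2
print(forward("1111"))  # no entry in this toy table: None
```

Routing is the separate, slower process that computes the table contents; forwarding only consults them.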
Alternative core: circuit
switching
end-end resources
allocated to, reserved for
“call” between source &
dest:
• in diagram, each link has four
circuits.
– call gets 2nd circuit in top
link and 1st circuit in right
link.
• dedicated resources: no sharing
– circuit-like (guaranteed)
performance
• circuit segment idle if not used by
call (no sharing)
• commonly used in traditional
telephone networks



Circuit switching: FDM versus TDM
Example: 4 users
[Figure: two panels with frequency on the vertical axis and time on the horizontal axis, one for FDM and one for TDM]


Packet switching versus circuit switching
packet switching allows more users to use network!
example:
• 1 Mb/s link
• each user:
– 100 kb/s when “active”
– active 10% of time
• circuit-switching:
– 10 users
• packet switching:
– with 35 users, probability > 10 active at same time is less than .0004 *
Q: how did we get value 0.0004?
Q: what happens if > 35 users ?
[Figure: N users share a 1 Mbps link]
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
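One way to answer the first question is to treat the number of simultaneously active users as a binomial random variable. The sketch below assumes users are independent and each is active with probability 0.1; the function name `prob_more_than` is hypothetical.

```python
from math import comb


def prob_more_than(k: int, n: int, p: float) -> float:
    """P(X > k) for X ~ Binomial(n, p): more than k of n users active at once."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1, n + 1))


# 35 users, each active 10% of the time; the link supports 10 active users.
p_overload = prob_more_than(10, 35, 0.1)
print(f"{p_overload:.4f}")  # prints 0.0004
```

Raising the user count in the call shows how quickly the overload probability grows once more than 35 users share the link.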


Packet switching versus circuit
switching
is packet switching a “slam dunk winner?”
• great for bursty data
–resource sharing
–simpler, no call setup
• excessive congestion possible: packet delay and loss
–protocols needed for reliable data transfer,
congestion control
• Q: How to provide circuit-like behavior?
–bandwidth guarantees needed for audio/video
apps
–still an unsolved problem (chapter 7)
Q: human analogies of reserved resources (circuit switching)
versus on-demand allocation (packet-switching)?
Internet structure: network of
networks
▪ End systems connect to Internet via access ISPs (Internet
Service Providers)
• residential, company and university ISPs
▪ Access ISPs in turn must be interconnected.
• so that any two hosts can send packets to each other
▪ Resulting network of networks is very complex
• evolution was driven by economics and national policies
▪ Let’s take a stepwise approach to describe current Internet
structure



Internet structure: network of networks
Question: given millions of access ISPs, how to connect them together?
[Figure: many access nets, not yet connected to one another]


Internet structure: network of networks
Option: connect each access ISP to every other access ISP?
connecting each access ISP to each other directly doesn’t scale: O(N^2) connections.
[Figure: access nets fully meshed with one another]
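The scaling argument can be made concrete by counting links. In the sketch below, a full mesh of N access ISPs needs N(N-1)/2 bidirectional connections, while attaching each one to a single transit provider needs N. The function names are hypothetical and the values of N are arbitrary examples.

```python
def full_mesh_links(n: int) -> int:
    """Bidirectional links needed to connect every pair of n networks directly."""
    return n * (n - 1) // 2


def single_transit_links(n: int) -> int:
    """Links needed when each of n networks connects to one shared transit ISP."""
    return n


for n in (10, 1_000, 1_000_000):
    print(n, full_mesh_links(n), single_transit_links(n))
```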


Internet structure: network of networks
Option: connect each access ISP to one global transit ISP?
Customer and provider ISPs have economic agreement.
[Figure: every access net connects to a single global ISP]


Internet structure: network of networks
But if one global ISP is viable business, there will be competitors ….
[Figure: access nets connect to competing global ISPs A, B, and C]


Internet structure: network of networks
But if one global ISP is viable business, there will be competitors …. which must be interconnected
[Figure: ISPs A, B, and C interconnect through Internet exchange points (IXPs) and a peering link]


Internet structure: network of networks
… and regional networks may arise to connect access nets to ISPs
[Figure: a regional net connects several access nets to ISPs A, B, and C, which interconnect through IXPs]


Internet structure: network of networks
… and content provider networks (e.g., Google, Microsoft, Akamai) may run their own network, to bring services, content close to end users
[Figure: a content provider network added alongside ISPs A, B, and C, the IXPs, and the regional net]


Internet structure: network of networks
[Figure: Tier 1 ISPs and Google at the top, interconnected through IXPs; regional ISPs below them; access ISPs at the bottom]
• at center: small # of well-connected large networks
– “tier-1” commercial ISPs (e.g., Level 3, Sprint, AT&T, NTT), national & international coverage
– content provider network (e.g., Google): private network that connects its data centers to Internet, often bypassing tier-1, regional ISPs
Tier-1 ISP: e.g., Sprint
[Figure: POP (point-of-presence) with links to/from backbone, peering links, and links to/from customers]


Chapter 1: roadmap
1.1 what is the Internet?
1.2 network edge
• end systems, access networks, links
1.3 network core
• packet switching, circuit switching, network structure
1.4 delay, loss, throughput in networks
1.5 protocol layers, service models
1.6 networks under attack: security
1.7 history



How do loss and delay occur?
packets queue in router buffers
▪ packet arrival rate to link (temporarily) exceeds output link capacity
▪ packets queue, wait for turn
[Figure: packets from A and B arrive at a router; one packet being transmitted (delay), packets queueing (delay), free (available) buffers: arriving packets dropped (loss) if no free buffers]


Four sources of packet delay
[Figure: a packet from A to B passes through nodal processing, queueing, transmission, and propagation]
d_nodal = d_proc + d_queue + d_trans + d_prop
d_proc: nodal processing
▪ check bit errors
▪ determine output link
▪ typically < msec
d_queue: queueing delay
▪ time waiting at output link for transmission
▪ depends on congestion level of router
Four sources of packet delay
[Figure: a packet from A to B passes through nodal processing, queueing, transmission, and propagation]
d_nodal = d_proc + d_queue + d_trans + d_prop
d_trans: transmission delay:
▪ L: packet length (bits)
▪ R: link bandwidth (bps)
▪ d_trans = L/R
d_prop: propagation delay:
▪ d: length of physical link
▪ s: propagation speed (~2x10^8 m/sec)
▪ d_prop = d/s
d_trans and d_prop very different
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
* Check out the Java applet for an interactive animation on trans vs. prop delay
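The four delay components can be combined in a small Python sketch. The function name, argument names, and example numbers (a 12,000-bit packet, a 10 Mbps link, 1000 km of cable) are assumptions for illustration; only the formulas and the propagation speed of about 2x10^8 m/sec come from the slide.

```python
def nodal_delay(packet_bits, rate_bps, link_m, speed_mps=2e8, d_proc=0.0, d_queue=0.0):
    """Total nodal delay in seconds: d_proc + d_queue + d_trans + d_prop."""
    d_trans = packet_bits / rate_bps  # L/R: depends on packet size and link rate
    d_prop = link_m / speed_mps       # d/s: depends on distance only
    return d_proc + d_queue + d_trans + d_prop


# Assumed example: 12,000-bit packet, 10 Mbps link, 1000 km long.
total = nodal_delay(12_000, 10e6, 1_000_000)
print(f"{total * 1e3:.1f} ms")  # 1.2 ms transmission + 5.0 ms propagation
```

Making the link faster shrinks only the transmission term, while shortening the link shrinks only the propagation term, which is why the slide calls the two very different.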
Caravan analogy
[Figure: ten-car caravan, a toll booth, 100 km of highway, a second toll booth, 100 km of highway]
• cars “propagate” at 100 km/hr
• toll booth takes 12 sec to service car (bit transmission time)
• car ~ bit; caravan ~ packet
• Q: How long until caravan is lined up before 2nd toll booth?
• time to “push” entire caravan through toll booth onto highway = 12*10 = 120 sec
• time for last car to propagate from 1st to 2nd toll booth: 100km/(100km/hr) = 1 hr
• A: 62 minutes


Caravan analogy (more)
[Figure: ten-car caravan, a toll booth, 100 km of highway, a second toll booth, 100 km of highway]
• suppose cars now “propagate” at 1000 km/hr
• and suppose toll booth now takes one min to service a car
• Q: Will cars arrive to 2nd booth before all cars serviced at first booth?
• A: Yes! after 7 min, first car arrives at second booth; three cars still at first booth
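Both caravan answers follow from the same two quantities: service time per car (transmission) and driving time between booths (propagation). The Python sketch below reproduces the slide arithmetic; the function names are hypothetical.

```python
def caravan_times(cars, service_sec, distance_km, speed_kmh):
    """Return (first_car_arrival_sec, whole_caravan_arrival_sec) at the 2nd booth."""
    travel_sec = distance_km * 3600 / speed_kmh      # "propagation" time of one car
    first_arrival = service_sec + travel_sec         # first car: one service, then drive
    whole_caravan = cars * service_sec + travel_sec  # last car leaves after all are serviced
    return first_arrival, whole_caravan


def cars_still_at_first_booth(cars, service_sec, at_time_sec):
    """Cars not yet fully serviced at the first booth at the given time."""
    serviced = min(cars, int(at_time_sec // service_sec))
    return cars - serviced


# First scenario: 100 km/hr, 12 sec per car.
_, total = caravan_times(10, 12, 100, 100)
print(total / 60)  # 62.0 minutes

# Second scenario: 1000 km/hr, 60 sec per car.
first, _ = caravan_times(10, 60, 100, 1000)
print(first / 60, cars_still_at_first_booth(10, 60, first))  # 7.0 minutes, 3 cars
```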


Queueing delay (revisited)
• R: link bandwidth (bps)
• L: packet length (bits)
• a: average packet arrival rate
traffic intensity = La/R
▪ La/R ~ 0: avg. queueing delay small
▪ La/R -> 1: avg. queueing delay large
▪ La/R > 1: more “work” arriving than can be serviced, average delay infinite!
[Figure: average queueing delay plotted against traffic intensity La/R, with illustrations for La/R ~ 0 and La/R -> 1]
* Check online interactive animation on queuing and loss
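Traffic intensity is a single ratio, so it is easy to compute for a given link. In the sketch below the function names, the example arrival rates, and the three text labels returned by `regime` are assumptions; the slide gives only the qualitative behavior near 0, near 1, and above 1.

```python
def traffic_intensity(packet_bits, arrivals_per_sec, rate_bps):
    """La/R: average bits arriving per second divided by link rate."""
    return packet_bits * arrivals_per_sec / rate_bps


def regime(intensity):
    """Coarse label for the three cases listed on the slide."""
    if intensity > 1:
        return "overloaded: queue grows without bound"
    if intensity == 1:
        return "at capacity"
    return "stable: delay grows as intensity approaches 1"


# Assumed example: 12,000-bit packets on a 10 Mbps link.
for a in (50, 500, 1000):
    x = traffic_intensity(12_000, a, 10e6)
    print(a, round(x, 2), regime(x))
```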
“Real” Internet delays and routes
• what do “real” Internet delay & loss look like?
• traceroute program: provides delay measurement from source to router along end-end Internet path towards destination. For all i:
– sends three packets that will reach router i on path towards destination
– router i will return packets to sender
– sender times interval between transmission and reply.
[Figure: source sends 3 probes to each router along the path]


“Real” Internet delays, routes
traceroute: gaia.cs.umass.edu to www.eurecom.fr
each line shows 3 delay measurements; line 1, for example, is from gaia.cs.umass.edu to cs-gw.cs.umass.edu
1 cs-gw (128.119.240.254) 1 ms 1 ms 2 ms
2 border1-rt-fa5-1-0.gw.umass.edu (128.119.3.145) 1 ms 1 ms 2 ms
3 cht-vbns.gw.umass.edu (128.119.3.130) 6 ms 5 ms 5 ms
4 jn1-at1-0-0-19.wor.vbns.net (204.147.132.129) 16 ms 11 ms 13 ms
5 jn1-so7-0-0-0.wae.vbns.net (204.147.136.136) 21 ms 18 ms 18 ms
6 abilene-vbns.abilene.ucaid.edu (198.32.11.9) 22 ms 18 ms 22 ms
7 nycm-wash.abilene.ucaid.edu (198.32.8.46) 22 ms 22 ms 22 ms
8 62.40.103.253 (62.40.103.253) 104 ms 109 ms 106 ms
9 de2-1.de1.de.geant.net (62.40.96.129) 109 ms 102 ms 104 ms
10 de.fr1.fr.geant.net (62.40.96.50) 113 ms 121 ms 114 ms
11 renater-gw.fr1.fr.geant.net (62.40.103.54) 112 ms 114 ms 112 ms
12 nio-n2.cssi.renater.fr (193.51.206.13) 111 ms 114 ms 116 ms
13 nice.cssi.renater.fr (195.220.98.102) 123 ms 125 ms 124 ms
14 r3t2-nice.cssi.renater.fr (195.220.98.110) 126 ms 126 ms 124 ms
15 eurecom-valbonne.r3t2.ft.net (193.48.50.54) 135 ms 128 ms 133 ms
16 194.214.211.25 (194.214.211.25) 126 ms 128 ms 126 ms
17 * * *
18 * * *
19 fantasia.eurecom.fr (193.55.113.142) 132 ms 128 ms 136 ms
▪ trans-oceanic link: between hops 7 and 8 (delay jumps from 22 ms to over 100 ms)
▪ * means no response (probe lost, router not replying)
* Do some traceroutes from exotic countries at www.traceroute.org
Packet loss
• queue (aka buffer) preceding link has finite capacity
• packet arriving to full queue dropped (aka lost)
• lost packet may be retransmitted by previous node, by source end system, or not at all
[Figure: packets from A and B enter a buffer (waiting area) ahead of the packet being transmitted; packet arriving to full buffer is lost]
* Check out the Java applet for an interactive animation on queuing and loss
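The drop behavior described above is often called drop-tail queueing. A minimal Python sketch follows; the class name `DropTailQueue`, its methods, and the capacity of 3 are assumptions for illustration and do not model any particular router.

```python
from collections import deque


class DropTailQueue:
    """Finite FIFO buffer: packets arriving to a full buffer are dropped."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buffer = deque()
        self.dropped = 0

    def enqueue(self, packet) -> bool:
        """Try to buffer a packet; return False and count a loss if the buffer is full."""
        if len(self.buffer) >= self.capacity:
            self.dropped += 1
            return False
        self.buffer.append(packet)
        return True

    def transmit(self):
        """Remove and return the packet at the head of the queue, or None if empty."""
        return self.buffer.popleft() if self.buffer else None


q = DropTailQueue(capacity=3)
results = [q.enqueue(n) for n in range(5)]  # burst of 5 packets into 3 buffer slots
print(results, q.dropped)
```

Whether a dropped packet is ever resent is decided elsewhere (previous node, source end system, or nobody), as the slide notes.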
Throughput
• throughput: rate (bits/time unit) at which bits transferred between sender/receiver
– instantaneous: rate at given point in time
– average: rate over longer period of time
[Figure: server, with file of F bits to send to client, sends bits (fluid) into pipe; first link capacity Rs bits/sec (pipe that can carry fluid at rate Rs bits/sec); second link capacity Rc bits/sec (pipe that can carry fluid at rate Rc bits/sec)]


Throughput (more)
• Rs < Rc: What is average end-end throughput?
▪ Rs > Rc: What is average end-end throughput?
[Figure: in each case, an Rs bits/sec link followed by an Rc bits/sec link]
bottleneck link: link on end-end path that constrains end-end throughput


Throughput: Internet scenario
• per-connection end-end throughput: min(Rc,Rs,R/10)
• in practice: Rc or Rs is often bottleneck
[Figure: 10 connections (fairly) share backbone bottleneck link R bits/sec; each connection has a server link Rs and a client link Rc]
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
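The bottleneck rule can be written as a one-line minimum. In the Python sketch below, the function name and the example rates are assumptions; the formula min(Rc, Rs, R/10) is from the slide, generalized here to any number of connections that share the backbone link equally.

```python
def per_connection_throughput(rs_bps, rc_bps, shared_bps, connections=10):
    """End-end throughput when `connections` flows fairly share one backbone link."""
    return min(rs_bps, rc_bps, shared_bps / connections)


# Assumed rates: 2 Mbps server links, 1 Mbps client links.
print(per_connection_throughput(2e6, 1e6, 5e6))    # backbone share (0.5 Mbps) is the bottleneck
print(per_connection_throughput(2e6, 1e6, 100e6))  # client link (1 Mbps) is the bottleneck
```

The second call reflects the slide's remark that in practice an access link is often the bottleneck.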


Chapter 1: roadmap
1.1 what is the Internet?
1.2 network edge
• end systems, access networks, links
1.3 network core
• packet switching, circuit switching, network structure
1.4 delay, loss, throughput in networks
1.5 protocol layers, service models
1.6 networks under attack: security
1.7 history



Protocol “layers”
Networks are complex, with many “pieces”:
▪ hosts
▪ routers
▪ links of various media
▪ applications
▪ protocols
▪ hardware, software
Question: is there any hope of organizing structure of network?
…. or at least our discussion of networks?


Organization of air travel

ticket (purchase) ticket (complain)

baggage (check) baggage (claim)

gates (load) gates (unload)

runway takeoff runway landing

airplane routing airplane routing


airplane routing

• a series of steps



Layering of airline functionality
departure airport  | intermediate air-traffic control centers | arrival airport   | layer
ticket (purchase)  |                                          | ticket (complain) | ticket
baggage (check)    |                                          | baggage (claim)   | baggage
gates (load)       |                                          | gates (unload)    | gate
runway (takeoff)   |                                          | runway (land)     | takeoff/landing
airplane routing   | airplane routing, airplane routing       | airplane routing  | airplane routing
layers: each layer implements a service
▪ via its own internal-layer actions
▪ relying on services provided by layer below
Why layering?
dealing with complex systems:
• explicit structure allows identification,
relationship of complex system’s pieces
– layered reference model for discussion
• modularization eases maintenance,
updating of system
– change of implementation of layer’s service
transparent to rest of system
– e.g., change in gate procedure doesn’t affect
rest of system
• layering considered harmful?
Internet protocol stack
• application: supporting network applications
– FTP, SMTP, HTTP
• transport: process-process data transfer
– TCP, UDP
• network: routing of datagrams from source to destination
– IP, routing protocols
• link: data transfer between neighboring network elements
– Ethernet, 802.11 (WiFi), PPP
• physical: bits “on the wire”
[Figure: stack, top to bottom: application, transport, network, link, physical]


ISO/OSI reference model
• presentation: allow applications to interpret meaning of data, e.g., encryption, compression, machine-specific conventions
• session: synchronization, checkpointing, recovery of data exchange
• Internet stack “missing” these layers!
– these services, if needed, must be implemented in application
– needed?
[Figure: stack, top to bottom: application, presentation, session, transport, network, link, physical]


Encapsulation
source:
message    M             application
segment    Ht M          transport
datagram   Hn Ht M       network
frame      Hl Hn Ht M    link
                         physical
switch: link, physical (handles frame Hl Hn Ht M)
router: network, link, physical (handles datagram Hn Ht M inside frame Hl Hn Ht M)
destination:
           M             application
           Ht M          transport
           Hn Ht M       network
           Hl Hn Ht M    link
                         physical
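Encapsulation amounts to each layer prepending its own header on the way down and stripping it on the way up. The Python sketch below shows this with placeholder byte strings; the header contents (`b"Ht|"`, `b"Hn|"`, `b"Hl|"`) and the function names are assumptions for illustration and bear no relation to real TCP, IP, or Ethernet header formats.

```python
# Placeholder headers, listed in the order they are added at the source.
HEADERS = (b"Ht|", b"Hn|", b"Hl|")  # transport, network, link


def encapsulate(message: bytes) -> bytes:
    """Wrap an application message into a segment, then a datagram, then a frame."""
    unit = message
    for header in HEADERS:
        unit = header + unit  # each lower layer prepends its header
    return unit


def decapsulate(frame: bytes) -> bytes:
    """Strip link, network, and transport headers in that order at the destination."""
    unit = frame
    for header in reversed(HEADERS):
        if not unit.startswith(header):
            raise ValueError(f"expected header {header!r}")
        unit = unit[len(header):]
    return unit


frame = encapsulate(b"M")
print(frame)               # b'Hl|Hn|Ht|M'
print(decapsulate(frame))  # b'M'
```

A switch would inspect only the outermost (link) header, and a router the link and network headers, leaving the rest untouched.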


Chapter 1: roadmap
1.1 what is the Internet?
1.2 network edge
• end systems, access networks, links
1.3 network core
• packet switching, circuit switching, network structure
1.4 delay, loss, throughput in networks
1.5 protocol layers, service models
1.6 networks under attack: security
1.7 history



Network security
• field of network security:
– how bad guys can attack computer networks
– how we can defend networks against attacks
– how to design architectures that are immune to
attacks
• Internet not originally designed with (much)
security in mind
– original vision: “a group of mutually trusting
users attached to a transparent network” ☺
– Internet protocol designers playing “catch-up”
– security considerations in all layers!



Bad guys: put malware into
hosts via Internet
• malware can get in host from:
– virus: self-replicating infection by
receiving/executing object (e.g., e-mail
attachment)
– worm: self-replicating infection by passively
receiving object that gets itself executed
• spyware malware can record keystrokes,
web sites visited, upload info to collection
site
• infected host can be enrolled in botnet,
used for spam, DDoS attacks


Bad guys: attack server, network
infrastructure
Denial of Service (DoS): attackers make resources
(server, bandwidth) unavailable to legitimate
traffic by overwhelming resource with bogus traffic

1. select target
2. break into hosts around
the network (see botnet)
3. send packets to target from
compromised hosts


Bad guys can sniff packets
packet “sniffing”:
▪ broadcast media (shared Ethernet, wireless)
▪ promiscuous network interface reads/records all packets (e.g., including passwords!) passing by
[Figure: A, B, and C on a shared medium; C captures the packet “src:B dest:A payload”]
▪ wireshark software used for end-of-chapter labs is a (free) packet-sniffer
Bad guys can use fake addresses
IP spoofing: send packet with false source address
[Figure: C sends the packet “src:B dest:A payload” to A]
… lots more on security (throughout, Chapter 8)


Chapter 1: roadmap
1.1 what is the Internet?
1.2 network edge
• end systems, access networks, links
1.3 network core
• packet switching, circuit switching, network structure
1.4 delay, loss, throughput in networks
1.5 protocol layers, service models
1.6 networks under attack: security
1.7 history



Internet history
1961-1972: Early packet-switching principles
▪ 1961: Kleinrock - queueing theory shows effectiveness of packet-switching
▪ 1964: Baran - packet-switching in military nets
▪ 1967: ARPAnet conceived by Advanced Research Projects Agency
▪ 1969: first ARPAnet node operational
• 1972:
– ARPAnet public demo
– NCP (Network Control Protocol) first host-host protocol
– first e-mail program
– ARPAnet has 15 nodes


Internet history
1972-1980: Internetworking, new and proprietary nets
• 1970: ALOHAnet satellite network in Hawaii
• 1974: Cerf and Kahn - architecture for interconnecting networks
• 1976: Ethernet at Xerox PARC
• late 70’s: proprietary architectures: DECnet, SNA, XNA
• late 70’s: switching fixed length packets (ATM precursor)
• 1979: ARPAnet has 200 nodes
Cerf and Kahn’s internetworking principles:
– minimalism, autonomy - no internal changes required to interconnect networks
– best effort service model
– stateless routers
– decentralized control
define today’s Internet architecture


Internet history
1980-1990: new protocols, a proliferation of networks
• 1983: deployment of TCP/IP
• 1982: smtp e-mail protocol defined
• 1983: DNS defined for name-to-IP-address translation
• 1985: ftp protocol defined
• 1988: TCP congestion control
▪ new national networks: CSnet, BITnet, NSFnet, Minitel
▪ 100,000 hosts connected to confederation of networks


Internet history
1990, 2000’s: commercialization, the Web, new apps
• early 1990’s: ARPAnet decommissioned
• 1991: NSF lifts restrictions on commercial use of NSFnet (decommissioned, 1995)
• early 1990s: Web
– hypertext [Bush 1945, Nelson 1960’s]
– HTML, HTTP: Berners-Lee
– 1994: Mosaic, later Netscape
– late 1990’s: commercialization of the Web
late 1990’s – 2000’s:
• more killer apps: instant messaging, P2P file sharing
• network security to forefront
• est. 50 million hosts, 100 million+ users
• backbone links running at Gbps


Internet history
2005-present
• ~5B devices attached to Internet (2016)
  – smartphones and tablets
• aggressive deployment of broadband access
• increasing ubiquity of high-speed wireless access
• emergence of online social networks:
  – Facebook: ~ one billion users
• service providers (Google, Microsoft) create their own networks
  – bypass Internet, providing “instantaneous” access to search, video content, email, etc.
• e-commerce, universities, enterprises running their services in “cloud” (e.g., Amazon EC2)

Introduction: summary
covered a “ton” of material!
• Internet overview
• what’s a protocol?
• network edge, core, access network
  – packet-switching versus circuit-switching
  – Internet structure
• performance: loss, delay, throughput
• layering, service models
• security
• history

you now have:
• context, overview, “feel” of networking
• more depth, detail to follow!

Chapter 1
Additional Slides



Packet analyzing basics

[Figure: a packet analyzer application runs next to ordinary applications (e.g., www browser, email client). Its packet capture (pcap) component receives a copy of all Ethernet frames sent/received by the OS protocol stack: Transport (TCP/UDP), Network (IP), Link (Ethernet), Physical.]


Chapter 2
Application Layer
Chapter 2: outline

2.1 principles of network applications
2.2 Web and HTTP
2.3 electronic mail
  • SMTP, POP3, IMAP
2.4 DNS
2.5 P2P applications
2.6 video streaming and content distribution networks


Chapter 2: application layer

our goals:
• conceptual, implementation aspects of network application protocols
  – transport-layer service models
  – client-server paradigm
  – peer-to-peer paradigm
  – content distribution networks
• learn about protocols by examining popular application-level protocols
  – HTTP
  – FTP
  – SMTP / POP3 / IMAP
  – DNS

Some network apps

• e-mail
• web
• text messaging
• remote login
• P2P file sharing
• multi-user network games
• streaming stored video (YouTube, Hulu, Netflix)
• voice over IP (e.g., Skype)
• real-time video conferencing
• social networking
• search
• …

Creating a network app
write programs that:
• run on (different) end systems
• communicate over network
• e.g., web server software communicates with browser software

no need to write software for network-core devices
• network-core devices do not run user applications
• applications on end systems allow for rapid app development, propagation

[Figure: end systems run the full stack (application, transport, network, data link, physical); network-core devices run only network, data link, physical.]


Application architectures
possible structure of applications:
• client-server
• peer-to-peer (P2P)



Client-server architecture
server:
• always-on host
• permanent IP address
• data centers for scaling

clients:
• communicate with server
• may be intermittently connected
• may have dynamic IP addresses
• do not communicate directly with each other


P2P architecture
• no always-on server
• arbitrary end systems directly communicate (peer-peer)
• peers request service from other peers, provide service in return to other peers
  – self scalability – new peers bring new service capacity, as well as new service demands
• peers are intermittently connected and change IP addresses
  – complex management

Processes communicating

process: program running within a host
• within same host, two processes communicate using inter-process communication (defined by OS)
• processes in different hosts communicate by exchanging messages

clients, servers
client process: process that initiates communication
server process: process that waits to be contacted
▪ aside: applications with P2P architectures have client processes & server processes


Sockets
• process sends/receives messages to/from its socket
• socket analogous to door
– sending process shoves message out door
– sending process relies on transport
infrastructure on other side of door to deliver
message to socket at receiving process
[Figure: on each host the process sits in the application layer (controlled by app developer) and reaches the network through a socket; the transport, network, link and physical layers below the socket are controlled by OS; the two hosts are connected through the Internet.]
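The door analogy can be tried on a single machine. The sketch below is only an illustration, assuming Python's standard `socket` module; `socket.socketpair()` stands in for a real client/server pair and is not how two separate hosts would connect.

```python
import socket

# socketpair() returns two already-connected sockets on the same host;
# they stand in for the sending and receiving "doors".
sender, receiver = socket.socketpair()

# The sending process shoves a message out of its door...
sender.sendall(b"hello through the door")

# ...and the transport infrastructure delivers it to the peer socket.
received = receiver.recv(1024)

sender.close()
receiver.close()
print(received)
```

The application code only touches the two socket objects; everything below them is handled by the OS.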


Addressing processes

• to receive messages, process must have identifier
• host device has unique 32-bit IP address
• Q: does IP address of host on which process runs suffice for identifying the process?
▪ A: no, many processes can be running on same host
• identifier includes both IP address and port numbers associated with process on host.
• example port numbers:
  – HTTP server: 80
  – mail server: 25
• to send HTTP message to gaia.cs.umass.edu web server:
  – IP address: 128.119.245.12
  – port number: 80
• more shortly…

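A small sketch of the identifier idea, assuming Python's standard `ipaddress` module. The address and port are the ones from the slide; the variable names are illustrative only.

```python
import ipaddress

# The host part of the identifier: a 32-bit IPv4 address.
addr = ipaddress.IPv4Address("128.119.245.12")
as_int = int(addr)          # the same address as one 32-bit number
as_bytes = addr.packed      # the same address as 4 bytes

# The process identifier pairs the host address with a port number.
http_server_id = (str(addr), 80)

print(as_int.bit_length(), len(as_bytes), http_server_id)
```

The `(address, port)` tuple is the same shape that socket APIs typically expect when connecting to a remote process.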
App-layer protocol defines
• types of messages exchanged,
  – e.g., request, response
• message syntax:
  – what fields in messages & how fields are delineated
• message semantics
  – meaning of information in fields
• rules for when and how processes send & respond to messages

open protocols:
• defined in RFCs
• allows for interoperability
• e.g., HTTP, SMTP
proprietary protocols:
• e.g., Skype


What transport service does an app need?
data integrity
▪ some apps (e.g., file transfer, web transactions) require 100% reliable data transfer
▪ other apps (e.g., audio) can tolerate some loss

timing
• some apps (e.g., Internet telephony, interactive games) require low delay to be “effective”

throughput
▪ some apps (e.g., multimedia) require minimum amount of throughput to be “effective”
▪ other apps (“elastic apps”) make use of whatever throughput they get

security
▪ encryption, data integrity, …


Transport service requirements: common apps

application            data loss      throughput            time sensitive
file transfer          no loss        elastic               no
e-mail                 no loss        elastic               no
Web documents          no loss        elastic               no
real-time audio/video  loss-tolerant  audio: 5kbps-1Mbps    yes, 100’s msec
                                      video: 10kbps-5Mbps
stored audio/video     loss-tolerant  same as above         yes, few secs
interactive games      loss-tolerant  few kbps up           yes, 100’s msec
text messaging         no loss        elastic               yes and no


Internet transport protocols services
TCP service:
• reliable transport between sending and receiving process
• flow control: sender won’t overwhelm receiver
• congestion control: throttle sender when network overloaded
• does not provide: timing, minimum throughput guarantee, security
• connection-oriented: setup required between client and server processes

UDP service:
▪ unreliable data transfer between sending and receiving process
▪ does not provide: reliability, flow control, congestion control, timing, throughput guarantee, security, or connection setup

Q: why bother? Why is there a UDP?


Internet apps: application, transport protocols

application             application layer protocol              underlying transport protocol
e-mail                  SMTP [RFC 2821]                         TCP
remote terminal access  Telnet [RFC 854]                        TCP
Web                     HTTP [RFC 2616]                         TCP
file transfer           FTP [RFC 959]                           TCP
streaming multimedia    HTTP (e.g., YouTube), RTP [RFC 1889]    TCP or UDP
Internet telephony      SIP, RTP, proprietary (e.g., Skype)     TCP or UDP


Securing TCP

TCP & UDP
▪ no encryption
▪ cleartext passwds sent into socket traverse Internet in cleartext

SSL
▪ provides encrypted TCP connection
▪ data integrity
▪ end-point authentication

SSL is at app layer
• apps use SSL libraries, that “talk” to TCP

SSL socket API
▪ cleartext passwords sent into socket traverse Internet encrypted




Web and HTTP
First, a review…
• web page consists of objects
• object can be HTML file, JPEG image,
Java applet, audio file,…
• web page consists of base HTML-file
which includes several referenced
objects
• each object is addressable by a URL, e.g.,
  www.someschool.edu/someDept/pic.gif
  (host name: www.someschool.edu; path name: /someDept/pic.gif)

HTTP overview

HTTP: hypertext transfer protocol
• Web’s application layer protocol
• client/server model
  – client: browser that requests, receives, (using HTTP protocol) and “displays” Web objects
  – server: Web server sends (using HTTP protocol) objects in response to requests

[Figure: a PC running Firefox browser and an iPhone running Safari browser each exchange HTTP requests and responses with a server running Apache Web server.]


HTTP overview (continued)

uses TCP:
▪ client initiates TCP connection (creates socket) to server, port 80
▪ server accepts TCP connection from client
▪ HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
▪ TCP connection closed

HTTP is “stateless”
• server maintains no information about past client requests

aside: protocols that maintain “state” are complex!
▪ past history (state) must be maintained
▪ if server/client crashes, their views of “state” may be inconsistent, must be reconciled


HTTP connections

non-persistent HTTP
• at most one object sent over TCP connection
  – connection then closed
• downloading multiple objects required multiple connections

persistent HTTP
• multiple objects can be sent over single TCP connection between client, server

Non-persistent HTTP
suppose user enters URL: www.someSchool.edu/someDepartment/home.index
(contains text, references to 10 jpeg images)

1a. HTTP client initiates TCP connection to HTTP server (process) at www.someSchool.edu on port 80
1b. HTTP server at host www.someSchool.edu waiting for TCP connection at port 80. “accepts” connection, notifying client
2. HTTP client sends HTTP request message (containing URL) into TCP connection socket. Message indicates that client wants object someDepartment/home.index
3. HTTP server receives request message, forms response message containing requested object, and sends message into its socket


Non-persistent HTTP (cont.)
4. HTTP server closes TCP connection.
5. HTTP client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg objects


Non-persistent HTTP: response time
RTT (definition): time for a small packet to travel from client to server and back
HTTP response time:
• one RTT to initiate TCP connection
• one RTT for HTTP request and first few bytes of HTTP response to return
• file transmission time
• non-persistent HTTP response time = 2RTT + file transmission time

[Figure: client/server timeline: initiate TCP connection (one RTT), request file (one RTT), time to transmit file, file received.]
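The response-time formula can be turned into a short calculation. This is a sketch only: the function names and the sample RTT and transmission times are assumptions, and it models strictly serial fetches with no parallel connections.

```python
def non_persistent_response_time(rtt, transmit_time):
    """One object: one RTT for TCP setup, one RTT for request/response, plus transmission."""
    return 2 * rtt + transmit_time


def serial_page_time(rtt, transmit_time, n_objects):
    """All objects fetched one after another, each over its own TCP connection."""
    return n_objects * non_persistent_response_time(rtt, transmit_time)


# Hypothetical numbers: 100 ms RTT, 50 ms per object, base HTML file plus 10 images.
total = serial_page_time(rtt=0.1, transmit_time=0.05, n_objects=11)
print(f"{total:.2f} s")
```

Every extra object costs another 2 RTTs here, which is the overhead that persistent connections are meant to remove.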


Persistent HTTP

non-persistent HTTP issues:
▪ requires 2 RTTs per object
▪ OS overhead for each TCP connection
▪ browsers often open parallel TCP connections to fetch referenced objects

persistent HTTP:
▪ server leaves connection open after sending response
▪ subsequent HTTP messages between same client/server sent over open connection
▪ client sends requests as soon as it encounters a referenced object
▪ as little as one RTT for all the referenced objects


HTTP request message
• two types of HTTP messages: request, response
• HTTP request message:
  – ASCII (human-readable format)

request line (GET, POST, HEAD commands):
    GET /index.html HTTP/1.1\r\n
header lines:
    Host: www-net.cs.umass.edu\r\n
    User-Agent: Firefox/3.6.10\r\n
    Accept: text/html,application/xhtml+xml\r\n
    Accept-Language: en-us,en;q=0.5\r\n
    Accept-Encoding: gzip,deflate\r\n
    Accept-Charset: ISO-8859-1,utf-8;q=0.7\r\n
    Keep-Alive: 115\r\n
    Connection: keep-alive\r\n
carriage return, line feed at start of line indicates end of header lines:
    \r\n
(\r: carriage return character; \n: line-feed character)

* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/

HTTP request message:
general format

request line:   method sp URL sp version cr lf
header lines:   header field name value cr lf
                ...
                header field name value cr lf
blank line:     cr lf
body:           entity body
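The general format can be exercised with a few lines of string handling. This is a minimal sketch, not a full HTTP implementation: `build_request` and `parse_request` are hypothetical helper names, and the parser assumes well-formed input with `name: value` header lines.

```python
def build_request(method, url, headers, body=""):
    """Request line, header lines, blank line (cr lf), then the entity body."""
    lines = [f"{method} {url} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_request(raw):
    """Split a request back into its parts; assumes well-formed input."""
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, url, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, url, version, headers, body


raw = build_request("GET", "/index.html", {"Host": "www-net.cs.umass.edu"})
print(repr(raw))
print(parse_request(raw))
```

The empty line (a lone cr lf) is what lets the receiver tell where the header lines stop and the body begins.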


Uploading form input
POST method:
▪ web page often includes
form input
▪ input is uploaded to server
in entity body
URL method:
▪ uses GET method
▪ input is uploaded in URL
field of request line:
www.somesite.com/animalsearch?monkeys&banana



Method types

HTTP/1.0:
• GET
• POST
• HEAD
  – asks server to leave requested object out of response

HTTP/1.1:
• GET, POST, HEAD
• PUT
  – uploads file in entity body to path specified in URL field
• DELETE
  – deletes file specified in the URL field


HTTP response message
status line (protocol, status code, status phrase):
    HTTP/1.1 200 OK\r\n
header lines:
    Date: Sun, 26 Sep 2010 20:09:20 GMT\r\n
    Server: Apache/2.0.52 (CentOS)\r\n
    Last-Modified: Tue, 30 Oct 2007 17:00:02 GMT\r\n
    ETag: "17dc6-a5c-bf716880"\r\n
    Accept-Ranges: bytes\r\n
    Content-Length: 2652\r\n
    Keep-Alive: timeout=10, max=100\r\n
    Connection: Keep-Alive\r\n
    Content-Type: text/html; charset=ISO-8859-1\r\n
    \r\n
data, e.g., requested HTML file:
    data data data data data ...

* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/

HTTP response status codes
▪ status code appears in 1st line in server-to-
client response message.
▪ some sample codes:
200 OK
– request succeeded, requested object later in this msg
301 Moved Permanently
– requested object moved, new location specified later in this
msg (Location:)
400 Bad Request
– request msg not understood by server
404 Not Found
– requested document not found on this server
505 HTTP Version Not Supported

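As a quick check of the status line layout, the sketch below splits the first line of a response and looks up the sample codes in Python's standard `http.HTTPStatus` enum. `parse_status_line` is a hypothetical helper written for this example.

```python
from http import HTTPStatus


def parse_status_line(line):
    """Split 'version code phrase'; the phrase itself may contain spaces."""
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), phrase


SAMPLE_CODES = [200, 301, 400, 404, 505]
phrases = {code: HTTPStatus(code).phrase for code in SAMPLE_CODES}

print(parse_status_line("HTTP/1.1 200 OK\r\n"))
for code, phrase in phrases.items():
    print(code, phrase)
```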


Trying out HTTP (client side) for yourself
1. Telnet to your favorite Web server:
   telnet gaia.cs.umass.edu 80
   opens TCP connection to port 80 (default HTTP server port) at gaia.cs.umass.edu.
   anything typed in will be sent to port 80 at gaia.cs.umass.edu
2. type in a GET HTTP request:
   GET /kurose_ross/interactive/index.php HTTP/1.1
   Host: gaia.cs.umass.edu
   by typing this in (hit carriage return twice), you send this minimal (but complete) GET request to HTTP server
3. look at response message sent by HTTP server!
   (or use Wireshark to look at captured HTTP request/response)


User-server state: cookies

many Web sites use cookies
four components:
1) cookie header line of HTTP response message
2) cookie header line in next HTTP request message
3) cookie file kept on user’s host, managed by user’s browser
4) back-end database at Web site

example:
▪ Susan always accesses Internet from PC
▪ visits specific e-commerce site for first time
▪ when initial HTTP request arrives at site, site creates:
  • unique ID
  • entry in backend database for ID

Cookies: keeping “state” (cont.)
client cookie file at start: ebay 8734
1. client → server: usual http request msg
2. Amazon server creates ID 1678 for user, creates entry in backend database
3. server → client: usual http response + set-cookie: 1678
4. client cookie file now: ebay 8734, amazon 1678
5. client → server: usual http request msg + cookie: 1678
6. server takes cookie-specific action (accesses backend database)
7. server → client: usual http response msg

one week later:
1. client (cookie file: ebay 8734, amazon 1678) → server: usual http request msg + cookie: 1678
2. server takes cookie-specific action (accesses backend database)
3. server → client: usual http response msg

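The exchange can be mimicked with plain dictionaries. This is a toy sketch under stated assumptions: `CookieSite` is a hypothetical class, the header values carry only the bare ID as on the slide, and no real HTTP traffic is involved.

```python
import itertools


class CookieSite:
    """Toy server: hands out IDs and keeps a backend entry per ID."""

    def __init__(self, first_id=1678):
        self._next_id = itertools.count(first_id)
        self.backend = {}                      # back-end database: ID -> visit log

    def handle(self, request_headers):
        response_headers = {}
        cookie = request_headers.get("Cookie")
        if cookie not in self.backend:         # first visit: create ID and entry
            cookie = str(next(self._next_id))
            self.backend[cookie] = []
            response_headers["Set-Cookie"] = cookie
        self.backend[cookie].append("visit")   # cookie-specific action
        return response_headers


site = CookieSite()
cookie_file = {"ebay": "8734"}                 # kept by the user's browser

first = site.handle({})                        # usual request, no cookie yet
cookie_file["amazon"] = first["Set-Cookie"]

second = site.handle({"Cookie": cookie_file["amazon"]})   # later visit
print(cookie_file, site.backend)
```

HTTP itself stays stateless; the state lives in the cookie file on the client and the database entry on the server.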
Cookies (continued)
what cookies can be used for:
• authorization
• shopping carts
• recommendations
• user session state (Web e-mail)

aside: cookies and privacy:
▪ cookies permit sites to learn a lot about you
▪ you may supply name and e-mail to sites

how to keep “state”:
▪ protocol endpoints: maintain state at sender/receiver over multiple transactions
▪ cookies: http messages carry state


Web caches (proxy server)
goal: satisfy client request without involving origin server
• user sets browser: Web accesses via cache
• browser sends all HTTP requests to cache
  – object in cache: cache returns object
  – else cache requests object from origin server, then returns object to client

[Figure: clients send HTTP requests to a proxy server, which forwards cache misses to the origin servers.]


More about Web caching

• cache acts as both client and server
  – server for original requesting client
  – client to origin server
• typically cache is installed by ISP (university, company, residential ISP)

why Web caching?
• reduce response time for client request
• reduce traffic on an institution’s access link
• Internet dense with caches: enables “poor” content providers to effectively deliver content (so too does P2P file sharing)

Caching example:
assumptions:
▪ avg object size: 100K bits
▪ avg request rate from browsers to origin servers: 15/sec
▪ avg data rate to browsers: 1.50 Mbps
▪ RTT from institutional router to any origin server: 2 sec
▪ access link rate: 1.54 Mbps
consequences:
▪ LAN utilization: 0.15% (1.50 Mbps / 1 Gbps)
▪ access link utilization = 97% (1.50 / 1.54): problem!
▪ total delay = Internet delay + access delay + LAN delay
  = 2 sec + minutes + usecs

[Figure: institutional network with a 1 Gbps LAN, connected by a 1.54 Mbps access link to the public Internet and the origin servers.]


Caching example: fatter access link
assumptions:
▪ avg object size: 100K bits
▪ avg request rate from browsers to origin servers: 15/sec
▪ avg data rate to browsers: 1.50 Mbps
▪ RTT from institutional router to any origin server: 2 sec
▪ access link rate: 154 Mbps (upgraded from 1.54 Mbps)
consequences:
▪ LAN utilization: 0.15%
▪ access link utilization = 0.97% (1.50 / 154; was 97%)
▪ total delay = Internet delay + access delay + LAN delay
  = 2 sec + msecs + usecs (access delay was minutes)

Cost: increased access link speed (not cheap!)

Caching example: install local cache
assumptions:
▪ avg object size: 100K bits
▪ avg request rate from browsers to origin servers: 15/sec
▪ avg data rate to browsers: 1.50 Mbps
▪ RTT from institutional router to any origin server: 2 sec
▪ access link rate: 1.54 Mbps
consequences:
▪ LAN utilization: 0.15%
▪ access link utilization = ?
▪ total delay = ?
How to compute link utilization, delay?
Cost: web cache (cheap!)

Caching example: install local cache
Calculating access link utilization, delay with cache:
• suppose cache hit rate is 0.4
  – 40% requests satisfied at cache, 60% requests satisfied at origin
▪ access link utilization:
  ▪ 60% of requests use access link
  ▪ data rate to browsers over access link = 0.6*1.50 Mbps = .9 Mbps
  ▪ utilization = 0.9/1.54 = .58
▪ total delay
  ▪ = 0.6 * (delay from origin servers) + 0.4 * (delay when satisfied at cache)
  ▪ = 0.6 (2.01) + 0.4 (~msecs) = ~ 1.2 secs
  ▪ less than with 154 Mbps link (and cheaper too!)

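The arithmetic with a cache can be checked in a few lines. The function names are hypothetical, and the 10 ms delay for a cache hit is an assumption standing in for the slide's "~msecs".

```python
def access_link_utilization(rate_to_browsers_mbps, link_rate_mbps, hit_rate=0.0):
    """Only cache misses cross the access link."""
    return (1 - hit_rate) * rate_to_browsers_mbps / link_rate_mbps


def average_delay(hit_rate, origin_delay_s, cache_delay_s):
    """Weighted average of miss delay and hit delay."""
    return (1 - hit_rate) * origin_delay_s + hit_rate * cache_delay_s


util = access_link_utilization(1.50, 1.54, hit_rate=0.4)
delay = average_delay(0.4, origin_delay_s=2.01, cache_delay_s=0.01)  # 0.01 s is assumed

print(f"utilization = {util:.2f}, average delay = {delay:.2f} s")
```

Setting `hit_rate=0.0` reproduces the no-cache case, so the same two functions cover all three versions of the example.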
Conditional GET
• Goal: don’t send object if cache has up-to-date cached version
  – no object transmission delay
  – lower link utilization
• cache: specify date of cached copy in HTTP request
  If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
  HTTP/1.0 304 Not Modified

object not modified since <date>:
  client → server: HTTP request msg, If-modified-since: <date>
  server → client: HTTP response, HTTP/1.0 304 Not Modified

object modified after <date>:
  client → server: HTTP request msg, If-modified-since: <date>
  server → client: HTTP response, HTTP/1.0 200 OK, <data>
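The server-side decision can be sketched as a single comparison. `conditional_get` is a hypothetical function; it takes `datetime` objects directly instead of parsing the header text, and the placeholder body `b"<data>"` mirrors the slide.

```python
from datetime import datetime, timezone


def conditional_get(last_modified, if_modified_since=None):
    """Return (status line, body); the body is None when the cached copy is still current."""
    if if_modified_since is not None and last_modified <= if_modified_since:
        return "HTTP/1.0 304 Not Modified", None
    return "HTTP/1.0 200 OK", b"<data>"


object_changed = datetime(2007, 10, 30, 17, 0, 2, tzinfo=timezone.utc)
cached_after_change = datetime(2010, 9, 26, 20, 9, 20, tzinfo=timezone.utc)
cached_before_change = datetime(2007, 1, 1, tzinfo=timezone.utc)

print(conditional_get(object_changed, cached_after_change))
print(conditional_get(object_changed, cached_before_change))
```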




Electronic mail
Three major components:
• user agents
• mail servers
• simple mail transfer protocol: SMTP

User Agent
• a.k.a. “mail reader”
• composing, editing, reading mail messages
• e.g., Outlook, Thunderbird, iPhone mail client
• outgoing, incoming messages stored on server

[Figure: user agents attached to mail servers; each mail server holds user mailboxes and an outgoing message queue; mail servers exchange mail over SMTP.]


Electronic mail: mail servers

mail servers:
• mailbox contains incoming messages for user
• message queue of outgoing (to be sent) mail messages
• SMTP protocol between mail servers to send email messages
  – client: sending mail server
  – “server”: receiving mail server

Electronic Mail: SMTP [RFC 2821]

• uses TCP to reliably transfer email message from client to server, port 25
• direct transfer: sending server to receiving server
• three phases of transfer
  – handshaking (greeting)
  – transfer of messages
  – closure
• command/response interaction (like HTTP)
  – commands: ASCII text
  – response: status code and phrase
• messages must be in 7-bit ASCII


Scenario: Alice sends message to Bob
1) Alice uses UA to compose message “to” [email protected]
2) Alice’s UA sends message to her mail server; message placed in message queue
3) client side of SMTP opens TCP connection with Bob’s mail server
4) SMTP client sends Alice’s message over the TCP connection
5) Bob’s mail server places the message in Bob’s mailbox
6) Bob invokes his user agent to read message

[Figure: Alice’s user agent (1) → Alice’s mail server (2, 3) → SMTP transfer (4) → Bob’s mail server (5) → Bob’s user agent (6).]

Sample SMTP interaction
S: 220 hamburger.edu
C: HELO crepes.fr
S: 250 Hello crepes.fr, pleased to meet you
C: MAIL FROM: <[email protected]>
S: 250 [email protected]... Sender ok
C: RCPT TO: <[email protected]>
S: 250 [email protected] ... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: Do you like ketchup?
C: How about pickles?
C: .
S: 250 Message accepted for delivery
C: QUIT
S: 221 hamburger.edu closing connection

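The client half of the dialogue above follows a fixed order, which the sketch below generates. `smtp_client_lines` is a hypothetical helper, the example.com addresses are placeholders, and details such as escaping body lines that start with a dot are deliberately left out.

```python
def smtp_client_lines(helo_host, mail_from, rcpt_to, body_lines):
    """Client commands in order: handshake, envelope, message, closure."""
    lines = [
        f"HELO {helo_host}",
        f"MAIL FROM: <{mail_from}>",
        f"RCPT TO: <{rcpt_to}>",
        "DATA",
    ]
    lines += body_lines
    lines += [".", "QUIT"]      # a line holding only "." ends the message
    return lines


dialogue = smtp_client_lines(
    "crepes.fr",
    "sender@example.com",       # placeholder address
    "recipient@example.com",    # placeholder address
    ["Do you like ketchup?", "How about pickles?"],
)
for line in dialogue:
    print("C:", line)
```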


Try SMTP interaction for yourself:

• telnet servername 25
• see 220 reply from server
• enter HELO, MAIL FROM, RCPT TO, DATA, QUIT
commands

above lets you send email without using email client (reader)


SMTP: final words

• SMTP uses persistent connections
• SMTP requires message (header & body) to be in 7-bit ASCII
• SMTP server uses CRLF.CRLF to determine end of message

comparison with HTTP:
• HTTP: pull
• SMTP: push
• both have ASCII command/response interaction, status codes
• HTTP: each object encapsulated in its own response message
• SMTP: multiple objects sent in multipart message


Mail message format

SMTP: protocol for exchanging email messages
RFC 822: standard for text message format:
• header lines, e.g.,
  – To:
  – From:
  – Subject:
  different from SMTP MAIL FROM, RCPT TO: commands!
• blank line (separates header from body)
• Body: the “message”
  – ASCII characters only
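The header / blank line / body layout can be seen by building a message with Python's standard `email.message.EmailMessage`. The addresses and subject are placeholders, and the library adds a few MIME headers of its own that the slide does not show.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["To"] = "recipient@example.com"      # placeholder address
msg["From"] = "sender@example.com"       # placeholder address
msg["Subject"] = "lunch"
msg.set_content("Do you like ketchup?\nHow about pickles?\n")

text = msg.as_string()

# The first blank line separates the header lines from the body.
header_part, _, body_part = text.partition("\n\n")
print(header_part)
print("---")
print(body_part)
```

These `To:` and `From:` lines are part of the message itself; the SMTP `MAIL FROM` and `RCPT TO` commands are sent separately by the mail client.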


Mail access protocols
[Figure: sender’s user agent → (SMTP) → sender’s mail server → (SMTP) → receiver’s mail server → (mail access protocol, e.g., POP, IMAP) → receiver’s user agent.]

• SMTP: delivery/storage to receiver’s server
• mail access protocol: retrieval from server
  – POP: Post Office Protocol [RFC 1939]: authorization, download
  – IMAP: Internet Mail Access Protocol [RFC 1730]: more features, including manipulation of stored messages on server
  – HTTP: gmail, Hotmail, Yahoo! Mail, etc.

POP3 protocol
authorization phase
• client commands:
  – user: declare username
  – pass: password
• server responses
  – +OK
  – -ERR
transaction phase, client:
• list: list message numbers
• retr: retrieve message by number
• dele: delete
• quit

S: +OK POP3 server ready
C: user bob
S: +OK
C: pass hungry
S: +OK user successfully logged on
C: list
S: 1 498
S: 2 912
S: .
C: retr 1
S: <message 1 contents>
S: .
C: dele 1
C: retr 2
S: <message 2 contents>
S: .
C: dele 2
C: quit
S: +OK POP3 server signing off
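The transaction phase can be modelled with an in-memory mailbox. This is a toy sketch: `Pop3Mailbox` is a hypothetical class, there is no networking or authorization, and deletions take effect at `quit`, as they do in POP3's update step.

```python
class Pop3Mailbox:
    """In-memory stand-in for the server side of the transaction phase."""

    def __init__(self, messages):
        self.messages = dict(enumerate(messages, start=1))   # number -> contents
        self.deleted = set()

    def list(self):
        """(message number, size) for every message not marked deleted."""
        return [(n, len(m)) for n, m in self.messages.items() if n not in self.deleted]

    def retr(self, number):
        return self.messages[number]

    def dele(self, number):
        self.deleted.add(number)          # only marked here; removed at quit

    def quit(self):
        for number in self.deleted:
            del self.messages[number]
        self.deleted.clear()


box = Pop3Mailbox(["a" * 498, "b" * 912])
listing = box.list()
first = box.retr(1)
box.dele(1)
after_dele = box.list()
box.quit()
print(listing, after_dele, sorted(box.messages))
```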


POP3 (more) and IMAP
more about POP3
• previous example uses POP3 “download and delete” mode
  – Bob cannot re-read e-mail if he changes client
• POP3 “download-and-keep”: copies of messages on different clients
• POP3 is stateless across sessions

IMAP
• keeps all messages in one place: at server
• allows user to organize messages in folders
• keeps user state across sessions:
  – names of folders and mappings between message IDs and folder name


DNS: domain name system

people: many identifiers:
  – SSN, name, passport #
Internet hosts, routers:
  – IP address (32 bit) - used for addressing datagrams
  – “name”, e.g., www.yahoo.com - used by humans
Q: how to map between IP address and name, and vice versa ?

Domain Name System:
• distributed database implemented in hierarchy of many name servers
• application-layer protocol: hosts, name servers communicate to resolve names (address/name translation)
  – note: core Internet function, implemented as application-layer protocol
  – complexity at network’s “edge”


DNS: services, structure

DNS services
• hostname to IP address translation
• host aliasing
  – canonical, alias names
• mail server aliasing
• load distribution
  – replicated Web servers: many IP addresses correspond to one name

why not centralize DNS?
• single point of failure
• traffic volume
• distant centralized database
• maintenance
A: doesn’t scale!


DNS: a distributed, hierarchical database

Root DNS Servers
  com DNS servers
    yahoo.com DNS servers
    amazon.com DNS servers
  org DNS servers
    pbs.org DNS servers
  edu DNS servers
    poly.edu DNS servers
    umass.edu DNS servers
  …

client wants IP for www.amazon.com; 1st approximation:
• client queries root server to find com DNS server
• client queries .com DNS server to get amazon.com DNS server
• client queries amazon.com DNS server to get IP address for www.amazon.com


DNS: root name servers

• contacted by local name server that can not resolve name
• root name server:
  – contacts authoritative name server if name mapping not known
  – gets mapping
  – returns mapping to local name server

13 logical root name “servers” worldwide
• each “server” replicated many times

a. Verisign, Los Angeles CA (5 other sites)
b. USC-ISI Marina del Rey, CA
c. Cogent, Herndon, VA (5 other sites)
d. U Maryland College Park, MD
e. NASA Mt View, CA
f. Internet Software C. Palo Alto, CA (and 48 other sites)
g. US DoD Columbus, OH (5 other sites)
h. ARL Aberdeen, MD
i. Netnod, Stockholm (37 other sites)
j. Verisign, Dulles VA (69 other sites)
k. RIPE London (17 other sites)
l. ICANN Los Angeles, CA (41 other sites)
m. WIDE Tokyo (5 other sites)


TLD, authoritative servers

top-level domain (TLD) servers:
  – responsible for com, org, net, edu, aero, jobs, museums, and all top-level country domains, e.g.: uk, fr, ca, jp
  – Network Solutions maintains servers for .com TLD
  – Educause for .edu TLD
authoritative DNS servers:
  – organization’s own DNS server(s), providing authoritative hostname to IP mappings for organization’s named hosts
  – can be maintained by organization or service provider


Local DNS name server

• does not strictly belong to hierarchy
• each ISP (residential ISP, company, university) has one
  – also called “default name server”
• when host makes DNS query, query is sent to its local DNS server
  – has local cache of recent name-to-address translation pairs (but may be out of date!)
  – acts as proxy, forwards query into hierarchy


DNS name resolution example
• host at cis.poly.edu wants IP address for gaia.cs.umass.edu

iterated query:
▪ contacted server replies with name of server to contact
▪ “I don’t know this name, but ask this server”

[Figure: requesting host cis.poly.edu sends its query (1) to local DNS server dns.poly.edu, which in turn queries the root DNS server (2, reply 3), the TLD DNS server (4, reply 5) and the authoritative DNS server dns.cs.umass.edu (6, reply 7), then returns the address of gaia.cs.umass.edu to the host (8).]
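The iterated query can be simulated with a small table of servers. Everything here is a toy model: the server labels `root` and `edu-tld` and the table layout are assumptions, and only the names and the address for gaia.cs.umass.edu come from the slides.

```python
# Each server either answers directly or refers the asker to another server.
SERVERS = {
    "root": {"referral": {"edu": "edu-tld"}},
    "edu-tld": {"referral": {"umass.edu": "dns.cs.umass.edu"}},
    "dns.cs.umass.edu": {"answer": {"gaia.cs.umass.edu": "128.119.245.12"}},
}


def query(server, name):
    """One query/reply exchange: an answer, or 'ask this server instead'."""
    data = SERVERS[server]
    if name in data.get("answer", {}):
        return "answer", data["answer"][name]
    for suffix, next_server in data.get("referral", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(name)


def resolve_iteratively(name, start="root"):
    """What the local DNS server does: follow referrals until an answer arrives."""
    server, contacted = start, []
    while True:
        contacted.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, contacted
        server = value


address, path = resolve_iteratively("gaia.cs.umass.edu")
print(address, path)
```

The local DNS server does all the asking here; each contacted server does only one lookup and replies, which keeps the load on the upper levels low.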


DNS name resolution example
recursive query:
▪ puts burden of name resolution on contacted name server
▪ heavy load at upper levels of hierarchy?

[Figure: requesting host cis.poly.edu queries local DNS server dns.poly.edu (1), which queries the root DNS server (2), which queries the TLD DNS server (3), which queries the authoritative DNS server dns.cs.umass.edu (4); the replies return along the same path (5, 6, 7, 8) with the address of gaia.cs.umass.edu.]


DNS: caching, updating records
• once (any) name server learns mapping, it caches
mapping
– cache entries time out (disappear) after some time (TTL)
– TLD servers typically cached in local name servers
• thus root name servers not often visited
• cached entries may be out-of-date (best effort
name-to-address translation!)
– if a named host changes its IP address, the change may not be
known Internet-wide until all TTLs expire
• update/notify mechanisms: proposed IETF standard
– RFC 2136

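A minimal sketch of TTL-based caching, assuming a simple in-memory table. The class name `DnsCache` and the address are hypothetical, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
class DnsCache:
    """Name-to-address cache whose entries disappear after their TTL."""

    def __init__(self):
        self._entries = {}  # hostname -> (address, expiry time in seconds)

    def put(self, hostname, address, ttl, now):
        self._entries[hostname] = (address, now + ttl)

    def get(self, hostname, now):
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            del self._entries[hostname]  # entry timed out
            return None
        return address

cache = DnsCache()
cache.put("gaia.cs.umass.edu", "203.0.113.7", ttl=60, now=0)
fresh = cache.get("gaia.cs.umass.edu", now=59)    # still cached
expired = cache.get("gaia.cs.umass.edu", now=60)  # TTL has run out
```

Until the entry expires the cache keeps answering with the old address, which is why a changed mapping may not be visible everywhere at once.
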
DNS records

DNS: distributed database storing resource records (RR)

RR format: (name, value, type, ttl)

type=A
▪ name is hostname
▪ value is IP address

type=NS
– name is domain (e.g., foo.com)
– value is hostname of authoritative name server for this domain

type=CNAME
▪ name is alias name for some “canonical” (the real) name
▪ www.ibm.com is really servereast.backup2.ibm.com
▪ value is canonical name

type=MX
▪ value is name of mailserver associated with name
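The record types can be illustrated with a small table of (name, value, type, ttl) tuples. This is a sketch only: apart from the www.ibm.com alias, every value and TTL below is invented, and `lookup` and `resolve_address` are hypothetical helper names.

```python
# (name, value, type, ttl) tuples; values are hypothetical except the
# www.ibm.com alias, which is the example given for type=CNAME.
RECORDS = [
    ("www.ibm.com", "servereast.backup2.ibm.com", "CNAME", 3600),
    ("servereast.backup2.ibm.com", "192.0.2.10", "A", 300),
    ("foo.com", "dns.foo.com", "NS", 86400),
    ("foo.com", "mail.foo.com", "MX", 3600),
]

def lookup(name, rtype):
    """All values stored for this name and record type."""
    return [value for (n, value, t, _ttl) in RECORDS if n == name and t == rtype]

def resolve_address(name, max_aliases=8):
    """Follow CNAME aliases until a type A record is found."""
    for _ in range(max_aliases):
        addresses = lookup(name, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(name, "CNAME")
        if not aliases:
            return None
        name = aliases[0]  # continue with the canonical name
    return None
```

Here `resolve_address("www.ibm.com")` first finds the alias, then the A record of the canonical name.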


DNS protocol, messages

• query and reply messages, both with same message format

message header
▪ identification: 16 bit # for query, reply to query uses same #
▪ flags:
  ▪ query or reply
  ▪ recursion desired
  ▪ recursion available
  ▪ reply is authoritative

message format:
  2 bytes             2 bytes
  identification      flags
  # questions         # answer RRs
  # authority RRs     # additional RRs
  questions (variable # of questions)
  answers (variable # of RRs)
  authority (variable # of RRs)
  additional info (variable # of RRs)
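The six 2-byte header fields can be packed with Python's `struct` module. A minimal sketch: the function name and the identification value are arbitrary, and the position of the "recursion desired" bit is taken from RFC 1035 rather than from the slide.

```python
import struct

# "Recursion desired" flag bit; its position comes from RFC 1035.
RECURSION_DESIRED = 0x0100

def pack_dns_header(ident, flags, n_questions, n_answers=0,
                    n_authority=0, n_additional=0):
    """Six 16-bit fields in network byte order: 12 bytes in total."""
    return struct.pack("!6H", ident, flags, n_questions,
                       n_answers, n_authority, n_additional)

header = pack_dns_header(ident=0x1A2B, flags=RECURSION_DESIRED, n_questions=1)
fields = struct.unpack("!6H", header)
```

A reply to this query would carry the same identification value, which is how the requester matches replies to outstanding queries.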


DNS protocol, messages

  2 bytes             2 bytes
  identification      flags
  # questions         # answer RRs
  # authority RRs     # additional RRs

  questions (variable # of questions):   name, type fields for a query
  answers (variable # of RRs):           RRs in response to query
  authority (variable # of RRs):         records for authoritative servers
  additional info (variable # of RRs):   additional “helpful” info that may be used

Inserting records into DNS
• example: new startup “Network Utopia”
• register name networkutopia.com at DNS
registrar (e.g., Network Solutions)
– provide names, IP addresses of authoritative name
server (primary and secondary)
– registrar inserts two RRs into .com TLD server:
(networkutopia.com, dns1.networkutopia.com, NS)
(dns1.networkutopia.com, 212.212.212.1, A)

• create authoritative server type A record for
www.networkutopia.com; type MX record for
networkutopia.com


Attacking DNS
DDoS attacks
• bombard root servers with traffic
– not successful to date
– traffic filtering
– local DNS servers cache IPs of TLD servers, allowing root server bypass
• bombard TLD servers
– potentially more dangerous

redirect attacks
▪ man-in-middle
• intercept queries
▪ DNS poisoning
• send bogus replies to DNS server, which caches them

exploit DNS for DDoS
▪ send queries with spoofed source address: target IP
▪ requires amplification


Chapter 2: outline

2.1 principles of network applications
2.2 Web and HTTP
2.3 electronic mail
• SMTP, POP3, IMAP
2.4 DNS
2.5 P2P applications
2.6 video streaming and content distribution networks


Pure P2P architecture
• no always-on server
• arbitrary end systems
directly communicate
• peers are intermittently
connected and change
IP addresses
examples:
– file distribution
(BitTorrent)
– Streaming (KanKan)
– VoIP (Skype)



File distribution: client-server vs P2P

Question: how much time to distribute file (size F) from one server to N peers?
– peer upload/download capacity is limited resource

$u_s$: server upload capacity
$u_i$: peer i upload capacity
$d_i$: peer i download capacity

[Figure: server holding file of size F, with upload capacity $u_s$, connected through a network (with abundant bandwidth) to N peers with upload capacities $u_1 \dots u_N$ and download capacities $d_1 \dots d_N$]


File distribution time: client-server
• server transmission: must sequentially send (upload) N file copies:
– time to send one copy: $F/u_s$
– time to send N copies: $NF/u_s$
▪ client: each client must download file copy
• $d_{min}$ = min client download rate
• min client download time: $F/d_{min}$

time to distribute F to N clients using client-server approach:

$$D_{c\text{-}s} \ge \max\{NF/u_s,\; F/d_{min}\}$$

(the $NF/u_s$ term increases linearly in N)


File distribution time: P2P

• server transmission: must upload at least one copy
– time to send one copy: $F/u_s$
▪ client: each client must download file copy
• min client download time: $F/d_{min}$
▪ clients: as aggregate must download NF bits
• max upload rate (limiting max download rate) is $u_s + \sum_i u_i$

time to distribute F to N clients using P2P approach:

$$D_{P2P} \ge \max\{F/u_s,\; F/d_{min},\; NF/(u_s + \sum_i u_i)\}$$

NF increases linearly in N …
… but so does $\sum_i u_i$, as each peer brings service capacity

Client-server vs. P2P: example

client upload rate = u, F/u = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$

[Figure: minimum distribution time (0 to 3.5) versus N (0 to 35), one curve for client-server and one for P2P]
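The two lower bounds can be evaluated directly for this example. A small sketch, under two assumptions that are not on the slide: every peer uploads at the same rate u, and $d_{min}$ is set equal to $u_s$ (the slide only says it is at least that large).

```python
def client_server_time(n, f, u_s, d_min):
    """Lower bound on distribution time when the server sends all N copies."""
    return max(n * f / u_s, f / d_min)

def p2p_time(n, f, u_s, d_min, peer_uploads):
    """Lower bound on distribution time when peers also upload."""
    return max(f / u_s, f / d_min, n * f / (u_s + sum(peer_uploads)))

# Units: u = 1 file per hour, so F = 1 gives F/u = 1 hour.
U, F = 1.0, 1.0
U_S = 10 * U
D_MIN = U_S  # assumption: the example only states d_min >= u_s

def example(n):
    """Return (client-server bound, P2P bound) in hours for n peers."""
    return (client_server_time(n, F, U_S, D_MIN),
            p2p_time(n, F, U_S, D_MIN, [U] * n))
```

With N = 30, `example(30)` gives 3 hours for client-server but only 0.75 hours for P2P, because the P2P bound N/(10 + N) stays below 1 hour however large N becomes.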


P2P file distribution: BitTorrent
▪ file divided into 256 KB chunks
▪ peers in torrent send/receive file chunks

tracker: tracks peers participating in torrent
torrent: group of peers exchanging chunks of a file

Alice arrives …
… obtains list of peers from tracker
… and begins exchanging file chunks with peers in torrent


P2P file distribution: BitTorrent

• peer joining torrent:


– has no chunks, but will
accumulate them over time
from other peers
– registers with tracker to get
list of peers, connects to
subset of peers
(“neighbors”)

▪ while downloading, peer uploads chunks to other peers


▪ peer may change peers with whom it exchanges chunks
▪ churn: peers may come and go
▪ once peer has entire file, it may (selfishly) leave or
(altruistically) remain in torrent



BitTorrent: requesting, sending file chunks

requesting chunks:
▪ at any given time, different peers have different subsets of file chunks
▪ periodically, Alice asks each peer for list of chunks that they have
▪ Alice requests missing chunks from peers, rarest first

sending chunks: tit-for-tat
▪ Alice sends chunks to those four peers currently sending her chunks at highest rate
• other peers are choked by Alice (do not receive chunks from her)
• re-evaluate top 4 every 10 secs
▪ every 30 secs: randomly select another peer, start sending chunks
• “optimistically unchoke” this peer
• newly chosen peer may join top 4
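Both decisions can be sketched in a few lines. This is an illustration of the two rules rather than BitTorrent's actual implementation; the peer names, chunk numbers, and rates are made up, and ties are broken by chunk number only to keep the result deterministic.

```python
from collections import Counter

def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks we lack by how few neighbors hold them."""
    counts = Counter(c for chunks in neighbor_chunks.values() for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    return sorted(missing, key=lambda c: (counts[c], c))

def top_uploaders(rates_to_me, k=4):
    """Peers currently sending to us at the highest rates; the rest are choked."""
    return sorted(rates_to_me, key=rates_to_me.get, reverse=True)[:k]

neighbors = {"bob": {0, 1, 2}, "carol": {1, 2}, "dave": {2, 3}}
request_order = rarest_first({2}, neighbors)

rates = {"bob": 50, "carol": 10, "dave": 80, "erin": 30, "frank": 20}
unchoked = top_uploaders(rates)
```

Chunks 0 and 3 are each held by a single neighbor, so they are requested before chunk 1; carol has the lowest rate and is the one peer left choked.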


BitTorrent: tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers

higher upload rate: find better trading partners, get file faster!




Video Streaming and CDNs:
context
▪ video traffic: major consumer of Internet bandwidth
• Netflix, YouTube: 37%, 16% of downstream
residential ISP traffic
• ~1B YouTube users, ~75M Netflix users
▪ challenge: scale - how to reach ~1B
users?
• single mega-video server won’t work (why?)
▪ challenge: heterogeneity
▪ different users have different capabilities (e.g.,
wired versus mobile; bandwidth rich versus
bandwidth poor)
▪ solution: distributed, application-level
infrastructure



Multimedia: video

• video: sequence of images displayed at constant rate
– e.g., 24 images/sec
• digital image: array of pixels
– each pixel represented by bits
• coding: use redundancy within and between images to decrease # bits used to encode image
– spatial (within image)
– temporal (from one image to next)

spatial coding example: instead of sending N values of same color (all purple), send only two values: color value (purple) and number of repeated values (N)

temporal coding example: instead of sending complete frame at i+1, send only differences from frame i
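The two coding ideas can be shown on tiny lists of pixel values. This is only a sketch of the principle (run-length encoding within a frame, differences between frames); real video codecs are far more elaborate, and the function names are hypothetical.

```python
def run_length_encode(pixels):
    """Spatial coding: replace each run of equal values with (value, count)."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1] = (p, runs[-1][1] + 1)
        else:
            runs.append((p, 1))
    return runs

def frame_delta(prev_frame, cur_frame):
    """Temporal coding: keep only (index, new value) where the frames differ."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev_frame, cur_frame)) if p != c]

encoded_row = run_length_encode(["purple"] * 5 + ["white"])
changes = frame_delta([1, 1, 2, 2], [1, 1, 3, 2])
```

Five purple pixels collapse into one (value, count) pair, and the second frame is described by the single pixel that changed.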


Multimedia: video

▪ CBR: (constant bit rate): video encoding rate fixed
▪ VBR: (variable bit rate): video encoding rate changes as amount of spatial, temporal coding changes
▪ examples:
• MPEG 1 (CD-ROM) 1.5 Mbps
• MPEG2 (DVD) 3-6 Mbps
• MPEG4 (often used in Internet, < 1 Mbps)


Streaming stored video:

simple scenario: video server (stored video) → Internet → client


Streaming multimedia: DASH
• DASH: Dynamic, Adaptive Streaming over
HTTP
• server:
– divides video file into multiple chunks
– each chunk stored, encoded at different rates
– manifest file: provides URLs for different chunks
• client:
– periodically measures server-to-client bandwidth
– consulting manifest, requests one chunk at a time
• chooses maximum coding rate sustainable given current
bandwidth
• can choose different coding rates at different points in time
(depending on available bandwidth at time)

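The client's rate choice can be sketched as picking the highest encoding rate that fits the measured bandwidth. The rates below are made up, and falling back to the lowest rate when nothing fits is an assumption of this sketch, not something the slide specifies.

```python
def choose_rate(available_kbps, measured_kbps):
    """Highest encoding rate sustainable at the measured bandwidth.

    Falls back to the lowest rate if even that exceeds the measurement
    (an assumption made for this sketch).
    """
    sustainable = [r for r in available_kbps if r <= measured_kbps]
    return max(sustainable) if sustainable else min(available_kbps)

RATES = [300, 750, 1500, 3000]  # hypothetical rates listed in a manifest file
picks = [choose_rate(RATES, bw) for bw in (100, 1000, 5000)]
```

Calling this once per chunk is what lets the client switch coding rates at different points in time as the available bandwidth changes.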


Streaming multimedia: DASH

• DASH: Dynamic, Adaptive Streaming over HTTP
• “intelligence” at client: client determines
– when to request chunk (so that buffer starvation, or overflow does not occur)
– what encoding rate to request (higher quality when more bandwidth available)
– where to request chunk (can request from URL server that is “close” to client or has high available bandwidth)

Content distribution networks
• challenge: how to stream content (selected
from millions of videos) to hundreds of
thousands of simultaneous users?

• option 1: single, large “mega-server”
– single point of failure
– point of network congestion
– long path to distant clients
– multiple copies of video sent over outgoing link

… quite simply: this solution doesn’t scale

Content distribution networks
• challenge: how to stream content (selected from
millions of videos) to hundreds of thousands of
simultaneous users?

• option 2: store/serve multiple copies of videos at multiple geographically distributed sites (CDN)
– enter deep: push CDN servers deep into many access networks
• close to users
• used by Akamai, 1700 locations
– bring home: smaller number (10’s) of larger clusters in POPs near (but not within) access networks
• used by Limelight


Content Distribution Networks (CDNs)
▪ CDN: stores copies of content at CDN nodes
• e.g. Netflix stores copies of Mad Men
▪ subscriber requests content from CDN
• directed to nearby copy, retrieves content
• may choose different copy if network path congested

[Figure: client obtains manifest file and asks “where’s Mad Men?”]

Content Distribution Networks (CDNs)

“over the top” (OTT): Internet host-host communication as a service

OTT challenges: coping with a congested Internet
▪ from which CDN node to retrieve content?
▪ viewer behavior in presence of congestion?
▪ what content to place in which CDN node?

CDN content access: a closer look
Bob (client) requests video http://netcinema.com/6Y7B23V
▪ video stored in CDN at http://KingCDN.com/NetC6y&B23V

1. Bob gets URL for video http://netcinema.com/6Y7B23V from netcinema.com web page
2. resolve http://netcinema.com/6Y7B23V via Bob’s local DNS server
3. netcinema’s authoritative DNS returns URL http://KingCDN.com/NetC6y&B23V
4&5. resolve http://KingCDN.com/NetC6y&B23V via KingCDN’s authoritative DNS, which returns IP address of KingCDN server with video
6. request video from KingCDN server, streamed via HTTP

Case study: Netflix

[Figure: Amazon cloud uploads copies of multiple versions of video to CDN servers]

1. Bob manages Netflix account (Netflix registration, accounting servers)
2. Bob browses Netflix video
3. Manifest file returned for requested video
4. DASH streaming

Chapter 2: summary
our study of network apps now complete!
▪ application architectures
• client-server
• P2P
▪ application service requirements:
• reliability, bandwidth, delay
▪ Internet transport service model
• connection-oriented, reliable: TCP
• unreliable, datagrams: UDP
▪ specific protocols:
• HTTP
• SMTP, POP, IMAP
• DNS
• P2P: BitTorrent
▪ video streaming, CDNs

Chapter 2: summary
most importantly: learned about protocols!

• typical request/reply message exchange:
– client requests info or service
– server responds with data, status code
• message formats:
– headers: fields giving info about data
– data: info (payload) being communicated

important themes:
▪ control vs. messages
• in-band, out-of-band
▪ centralized vs. decentralized
▪ stateless vs. stateful
▪ reliable vs. unreliable message transfer
▪ “complexity at network edge”

Chapter 3
Transport Layer



Chapter 3: Transport Layer

our goals:
▪ understand principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control
▪ learn about Internet transport layer protocols:
• UDP: connectionless transport
• TCP: connection-oriented reliable transport
• TCP congestion control

Chapter 3 outline

3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 principles of congestion control
3.7 TCP congestion control

Transport services and protocols
▪ provide logical communication between app processes running on different hosts
▪ transport protocols run in end systems
• send side: breaks app messages into segments, passes to network layer
• rcv side: reassembles segments into messages, passes to app layer
▪ more than one transport protocol available to apps
• Internet: TCP and UDP

[Figure: protocol stacks (application, transport, network, data link, physical) on two end systems]

Transport vs. network layer
▪ network layer: logical communication between hosts
▪ transport layer: logical communication between processes
• relies on, enhances, network layer services

household analogy:
12 kids in Ann’s house sending letters to 12 kids in Bill’s house:
• hosts = houses
• processes = kids
• app messages = letters in envelopes
• transport protocol = Ann and Bill who demux to in-house siblings
• network-layer protocol = postal service


Internet transport-layer protocols

• reliable, in-order delivery (TCP)
– congestion control
– flow control
– connection setup
• unreliable, unordered delivery: UDP
– no-frills extension of “best-effort” IP
• services not available:
– delay guarantees
– bandwidth guarantees

[Figure: two end systems with full protocol stacks (application, transport, network, data link, physical) connected through routers that implement only network, data link, physical]

Multiplexing/demultiplexing

multiplexing at sender: handle data from multiple sockets, add transport header (later used for demultiplexing)

demultiplexing at receiver: use header info to deliver received segments to correct socket

[Figure: three hosts, each with application, transport, network, link, physical layers; processes P1 to P4 attach to the transport layer through sockets]


How demultiplexing works

▪ host receives IP datagrams
• each datagram has source IP address, destination IP address
• each datagram carries one transport-layer segment
• each segment has source, destination port number
▪ host uses IP addresses & port numbers to direct segment to appropriate socket

TCP/UDP segment format (32 bits wide):
  source port #  |  dest port #
  other header fields
  application data (payload)


Connectionless demultiplexing

▪ recall: created socket has host-local port #:
DatagramSocket mySocket1 = new DatagramSocket(12534);
▪ recall: when creating datagram to send into UDP socket, must specify
• destination IP address
• destination port #
▪ when host receives UDP segment:
• checks destination port # in segment
• directs UDP segment to socket with that port #

IP datagrams with same dest. port #, but different source IP addresses and/or source port numbers will be directed to same socket at dest


Connectionless demux: example

P3 (left host):  DatagramSocket mySocket2 = new DatagramSocket(9157);
P1 (server):     DatagramSocket serverSocket = new DatagramSocket(6428);
P4 (right host): DatagramSocket mySocket1 = new DatagramSocket(5775);

segments between the left host and the server:
• source port: 9157, dest port: 6428
• source port: 6428, dest port: 9157

segments between the right host and the server:
• source port: ?, dest port: ?
• source port: ?, dest port: ?


Connection-oriented demux

▪ TCP socket identified by 4-tuple:
• source IP address
• source port number
• dest IP address
• dest port number
▪ demux: receiver uses all four values to direct segment to appropriate socket
▪ server host may support many simultaneous TCP sockets:
• each socket identified by its own 4-tuple
▪ web servers have different sockets for each connecting client
• non-persistent HTTP will have different socket for each request


Connection-oriented demux: example

[Figure: host with IP address A (process P3), server with IP address B (processes P4, P5, P6), host with IP address C (processes P2, P3)]

segments:
• source IP,port: A,9157; dest IP,port: B,80
• source IP,port: B,80; dest IP,port: A,9157
• source IP,port: C,5775; dest IP,port: B,80
• source IP,port: C,9157; dest IP,port: B,80

three segments, all destined to IP address: B, dest port: 80 are demultiplexed to different sockets

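The difference between the two demultiplexing rules can be sketched with plain dictionaries standing in for the socket tables. The table layout and socket labels are hypothetical; the addresses and ports are taken from the example.

```python
def udp_demux(sockets_by_port, segment):
    """UDP: the destination port alone selects the socket."""
    return sockets_by_port.get(segment["dst_port"])

def tcp_demux(sockets_by_4tuple, segment):
    """TCP: all four values together select the socket."""
    key = (segment["src_ip"], segment["src_port"],
           segment["dst_ip"], segment["dst_port"])
    return sockets_by_4tuple.get(key)

# Two segments from the example, both destined to B, port 80.
from_a = {"src_ip": "A", "src_port": 9157, "dst_ip": "B", "dst_port": 80}
from_c = {"src_ip": "C", "src_port": 9157, "dst_ip": "B", "dst_port": 80}

tcp_sockets = {
    ("A", 9157, "B", 80): "socket for A",
    ("C", 9157, "B", 80): "socket for C",
}
udp_sockets = {80: "the one UDP socket on port 80"}
```

With the TCP table the two segments reach different sockets even though both use source port 9157; with the UDP table both would land in the same socket.
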
Connection-oriented demux: example (threaded server)

[Figure: same hosts A, B, C and the same four segments as in the previous example, but server B runs a single threaded process P4]


UDP: User Datagram Protocol [RFC 768]
• “no frills,” “bare bones” Internet transport protocol
• “best effort” service, UDP segments may be:
– lost
– delivered out-of-order to app
• connectionless:
– no handshaking between UDP sender, receiver
– each UDP segment handled independently of others

▪ UDP use:
• streaming multimedia apps (loss tolerant, rate sensitive)
• DNS
• SNMP
▪ reliable transfer over UDP:
• add reliability at application layer
• application-specific error recovery!


UDP: segment header

UDP segment format (32 bits wide):
  source port #  |  dest port #
  length         |  checksum
  application data (payload)

length: length, in bytes, of UDP segment, including header

why is there a UDP?
▪ no connection establishment (which can add delay)
▪ simple: no connection state at sender, receiver
▪ small header size
▪ no congestion control: UDP can blast away as fast as desired
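The four header fields are each 16 bits, so the header is 8 bytes. A minimal sketch of building a segment with Python's `struct` module; `build_udp_segment` is a hypothetical helper, the ports are borrowed from the demux example, and the checksum is left at 0 as a placeholder rather than computed.

```python
import struct

UDP_HEADER_LEN = 8  # four 16-bit fields

def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Header (source port, dest port, length, checksum) followed by payload.

    The length field counts header plus payload, in bytes. The checksum
    is a placeholder here; it is not computed in this sketch.
    """
    length = UDP_HEADER_LEN + len(payload)
    header = struct.pack("!4H", src_port, dst_port, length, checksum)
    return header + payload

segment = build_udp_segment(9157, 6428, b"hello")
header_fields = struct.unpack("!4H", segment[:UDP_HEADER_LEN])
```

The small fixed header with no connection state is what the "small header size" and "simple" points refer to.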


UDP checksum

Goal: detect “errors” (e.g., flipped bits) in transmitted segment

sender:
• treat segment contents, including header fields, as sequence of 16-bit integers
• checksum: addition (one’s complement sum) of segment contents
• sender puts checksum value into UDP checksum field

receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
– NO - error detected
– YES - no error detected. But maybe errors nonetheless? More later ….


Internet checksum: example

example: add two 16-bit integers

                1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

wraparound    1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1

sum             1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum        0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1

Note: when adding numbers, a carryout from the most significant bit needs to be added to the result
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
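The wraparound addition can be checked in a few lines. A sketch, with hypothetical function names: it adds the 16-bit words 1110011001100110 and 1101010101010101, folds any carry out of the most significant bit back into the result, and inverts the bits of the sum.

```python
def ones_complement_sum(words):
    """Add 16-bit words, wrapping any carry-out back into the low bits."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # wraparound
    return total

def internet_checksum(words):
    """One's complement (bit inversion) of the one's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF

words = [0b1110011001100110, 0b1101010101010101]
total = ones_complement_sum(words)
checksum = internet_checksum(words)

# Receiver-side check: summing the words and the checksum gives all ones.
receiver_sum = ones_complement_sum(words + [checksum])
```

The sum is 1011101110111100 and the checksum 0100010001000011; at the receiver, words plus checksum add up to sixteen 1 bits when no error is detected.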


Principles of reliable data transfer
▪ important in application, transport, link layers
• top-10 list of important networking topics!
▪ characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started

send side:
• rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

receive side:
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer


Reliable data transfer: getting started
we’ll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
– but control info will flow in both directions!
• use finite state machines (FSM) to specify sender, receiver

FSM notation:
• state: when in this “state”, next state uniquely determined by next event
• each transition (state 1 → state 2) is labeled with the event causing the state transition and the actions taken on the state transition


rdt1.0: reliable transfer over a reliable channel
▪ underlying channel perfectly reliable
• no bit errors
• no loss of packets
▪ separate FSMs for sender, receiver:
• sender sends data into underlying channel
• receiver reads data from underlying channel

sender, state “Wait for call from above”:
  event:   rdt_send(data)
  actions: packet = make_pkt(data)
           udt_send(packet)

receiver, state “Wait for call from below”:
  event:   rdt_rcv(packet)
  actions: extract(packet,data)
           deliver_data(data)


rdt2.0: channel with bit errors

▪ underlying channel may flip bits in packet
• checksum to detect bit errors
▪ the question: how to recover from errors:
• acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
• negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
• sender retransmits pkt on receipt of NAK
▪ new mechanisms in rdt2.0 (beyond rdt1.0):
• error detection
• receiver feedback: control msgs (ACK,NAK) rcvr->sender

How do humans recover from “errors” during conversation?




rdt2.0: FSM specification

sender, state “Wait for call from above”:
  rdt_send(data)
    → sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to “Wait for ACK or NAK”

sender, state “Wait for ACK or NAK”:
  rdt_rcv(rcvpkt) && isNAK(rcvpkt)
    → udt_send(sndpkt)
  rdt_rcv(rcvpkt) && isACK(rcvpkt)
    → Λ (no action); go to “Wait for call from above”

receiver, state “Wait for call from below”:
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    → udt_send(NAK)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    → extract(rcvpkt,data); deliver_data(data); udt_send(ACK)


rdt2.0: operation with no errors

[Figure: the rdt2.0 FSMs, following the error-free path]
1. sender, “Wait for call from above”: rdt_send(data) → sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
2. receiver, “Wait for call from below”: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) → extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
3. sender, “Wait for ACK or NAK”: rdt_rcv(rcvpkt) && isACK(rcvpkt) → Λ (no action); back to “Wait for call from above”


rdt2.0: error scenario

[Figure: the rdt2.0 FSMs, following the path taken when a packet is corrupted]
1. sender, “Wait for call from above”: rdt_send(data) → sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
2. receiver, “Wait for call from below”: rdt_rcv(rcvpkt) && corrupt(rcvpkt) → udt_send(NAK)
3. sender, “Wait for ACK or NAK”: rdt_rcv(rcvpkt) && isNAK(rcvpkt) → udt_send(sndpkt)
4. receiver, “Wait for call from below”: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) → extract(rcvpkt,data); deliver_data(data); udt_send(ACK)


rdt2.0 has a fatal flaw!

what happens if ACK/NAK corrupted?
• sender doesn’t know what happened at receiver!
• can’t just retransmit: possible duplicate

handling duplicates:
• sender retransmits current pkt if ACK/NAK corrupted
• sender adds sequence number to each pkt
• receiver discards (doesn’t deliver up) duplicate pkt

stop and wait: sender sends one packet, then waits for receiver response


rdt2.1: sender, handles garbled ACK/NAKs

state “Wait for call 0 from above”:
  rdt_send(data)
    → sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to “Wait for ACK or NAK 0”

state “Wait for ACK or NAK 0”:
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    → udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    → Λ (no action); go to “Wait for call 1 from above”

state “Wait for call 1 from above”:
  rdt_send(data)
    → sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to “Wait for ACK or NAK 1”

state “Wait for ACK or NAK 1”:
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    → udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    → Λ (no action); go to “Wait for call 0 from above”


rdt2.1: receiver, handles garbled ACK/NAKs

state “Wait for 0 from below”:
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    → extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to “Wait for 1 from below”
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    → sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    → sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

state “Wait for 1 from below”:
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    → extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to “Wait for 0 from below”
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    → sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    → sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)


rdt2.1: discussion

sender:
• seq # added to pkt
• two seq. #’s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
– state must “remember” whether “expected” pkt should have seq # of 0 or 1

receiver:
▪ must check if received packet is duplicate
• state indicates whether 0 or 1 is expected pkt seq #
▪ note: receiver can not know if its last ACK/NAK received OK at sender
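The receiver's duplicate handling can be sketched as a two-state object. This is a simplification of the rdt2.1 receiver FSM: corruption is modelled as a flag passed in rather than detected with a checksum, and the class and method names are hypothetical.

```python
class Rdt21Receiver:
    """Tracks which sequence number (0 or 1) is expected next."""

    def __init__(self):
        self.expected_seq = 0
        self.delivered = []

    def receive(self, seq, data, corrupt=False):
        """Return the feedback sent to the sender: 'ACK' or 'NAK'."""
        if corrupt:
            return "NAK"
        if seq == self.expected_seq:
            self.delivered.append(data)  # new packet: deliver up
            self.expected_seq ^= 1       # now wait for the other seq #
        # a duplicate is ACKed again but not delivered a second time
        return "ACK"

rcv = Rdt21Receiver()
feedback = [
    rcv.receive(0, "a"),
    rcv.receive(0, "a"),                # retransmission after a garbled ACK
    rcv.receive(1, "b", corrupt=True),
    rcv.receive(1, "b"),
]
```

One bit of state is enough because, with stop-and-wait, the only packet that can arrive besides the expected one is a retransmission of the previous one.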


rdt2.2: a NAK-free protocol

▪ same functionality as rdt2.1, using ACKs only
▪ instead of NAK, receiver sends ACK for last pkt received OK
• receiver must explicitly include seq # of pkt being ACKed
▪ duplicate ACK at sender results in same action as NAK: retransmit current pkt


rdt2.2: sender, receiver fragments

sender FSM fragment

state “Wait for call 0 from above”:
  rdt_send(data)
    → sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to “Wait for ACK 0”

state “Wait for ACK 0”:
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    → udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    → Λ (no action)

receiver FSM fragment

state “Wait for 0 from below”:
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) )
    → udt_send(sndpkt)

transition into “Wait for 0 from below”:
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    → extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)

rdt3.0: channels with errors and loss

new assumption: underlying channel can also lose packets (data, ACKs)
  – checksum, seq. #, ACKs, retransmissions will be of help … but not enough

approach: sender waits “reasonable” amount of time for ACK
• retransmits if no ACK received in this time
• if pkt (or ACK) just delayed (not lost):
  – retransmission will be duplicate, but seq. #’s already handle this
  – receiver must specify seq # of pkt being ACKed
• requires countdown timer
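The timeout-and-retransmit approach above can be sketched as a small simulation. This is a minimal sketch, not the textbook's code: the function name `run_stop_and_wait` and the idea of scripting losses by transmission index are assumptions made for illustration, and a "timeout" is modelled simply as looping back to retransmit.

```python
def run_stop_and_wait(messages, drop_data=(), drop_ack=()):
    """Simulate an rdt3.0-style alternating-bit transfer.

    drop_data / drop_ack hold 0-based indices of data transmissions whose
    packet (or whose ACK) is lost. Returns (delivered, transmissions).
    """
    delivered = []
    expected = 0   # receiver: sequence number it is waiting for
    tx = 0         # number of data transmissions so far
    for i, msg in enumerate(messages):
        seq = i % 2                      # two sequence numbers suffice
        while True:
            this_tx = tx
            tx += 1
            if this_tx in drop_data:
                continue                 # packet lost: timeout, retransmit
            if seq == expected:
                delivered.append(msg)    # new data: deliver to upper layer
                expected ^= 1
            # a duplicate is discarded, but it is still ACKed with its seq #
            if this_tx in drop_ack:
                continue                 # ACK lost: timeout, retransmit
            break                        # ACK for current seq # received
    return delivered, tx


# pkt1 is lost once, and the ACK for the third message is lost once
delivered, transmissions = run_stop_and_wait(
    ["a", "b", "c"], drop_data={1}, drop_ack={3}
)
```

The retransmission caused by the lost ACK reaches the receiver as a duplicate; the sequence number lets the receiver discard it instead of delivering it twice.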

© Department of Networked Systems and Services 38


rdt3.0 sender
state "Wait for call 0 from above":
▪ rdt_send(data)
  – sndpkt = make_pkt(0, data, checksum)
  – udt_send(sndpkt)
  – start_timer
  – next state: "Wait for ACK0"
▪ rdt_rcv(rcvpkt)
  – Λ (no action)

state "Wait for ACK0":
▪ rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
  – Λ (no action)
▪ timeout
  – udt_send(sndpkt)
  – start_timer
▪ rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
  – stop_timer
  – next state: "Wait for call 1 from above"

state "Wait for call 1 from above":
▪ rdt_send(data)
  – sndpkt = make_pkt(1, data, checksum)
  – udt_send(sndpkt)
  – start_timer
  – next state: "Wait for ACK1"
▪ rdt_rcv(rcvpkt)
  – Λ (no action)

state "Wait for ACK1":
▪ rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
  – Λ (no action)
▪ timeout
  – udt_send(sndpkt)
  – start_timer
▪ rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
  – stop_timer
  – next state: "Wait for call 0 from above"

© Department of Networked Systems and Services 39


rdt3.0 in action

(a) no loss
▪ sender: send pkt0
▪ receiver: rcv pkt0, send ack0
▪ sender: rcv ack0, send pkt1
▪ receiver: rcv pkt1, send ack1
▪ sender: rcv ack1, send pkt0
▪ receiver: rcv pkt0, send ack0

(b) packet loss
▪ sender: send pkt0
▪ receiver: rcv pkt0, send ack0
▪ sender: rcv ack0, send pkt1
▪ pkt1 is lost (X)
▪ sender: timeout, resend pkt1
▪ receiver: rcv pkt1, send ack1
▪ sender: rcv ack1, send pkt0
▪ receiver: rcv pkt0, send ack0

© Department of Networked Systems and Services 40


rdt3.0 in action
(c) ACK loss
▪ sender: send pkt0
▪ receiver: rcv pkt0, send ack0
▪ sender: rcv ack0, send pkt1
▪ receiver: rcv pkt1, send ack1
▪ ack1 is lost (X)
▪ sender: timeout, resend pkt1
▪ receiver: rcv pkt1 (detect duplicate), send ack1
▪ sender: rcv ack1, send pkt0
▪ receiver: rcv pkt0, send ack0

(d) premature timeout/ delayed ACK
▪ sender: send pkt0
▪ receiver: rcv pkt0, send ack0
▪ sender: rcv ack0, send pkt1
▪ receiver: rcv pkt1, send ack1 (ACK is delayed)
▪ sender: timeout, resend pkt1
▪ receiver: rcv pkt1 (detect duplicate), send ack1
▪ sender: rcv ack1, send pkt0
▪ receiver: rcv pkt0, send ack0
▪ sender: rcv ack1, send pkt0
▪ receiver: rcv pkt0 (detect duplicate), send ack0

© Department of Networked Systems and Services 41


Performance of rdt3.0

▪ rdt3.0 is correct, but performance stinks


▪ e.g.: 1 Gbps link, 15 ms prop. delay, 8000 bit packet:

  $D_{trans} = \frac{L}{R} = \frac{8000\ \text{bits}}{10^{9}\ \text{bits/sec}} = 8\ \text{microsecs}$

▪ U sender: utilization – fraction of time sender busy sending

  $U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$

▪ if RTT=30 msec, 1KB pkt every 30 msec: 33kB/sec thruput over 1 Gbps link
▪ network protocol limits use of physical resources!
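The stop-and-wait numbers above can be reproduced with a few lines of arithmetic. This is a sketch only; the function name `stop_and_wait` and its return tuple are illustrative choices, not part of the slides.

```python
def stop_and_wait(packet_bits, rate_bps, rtt_s):
    """Return (transmission delay, sender utilization, throughput in bits/sec)."""
    d_trans = packet_bits / rate_bps                  # L / R
    utilization = d_trans / (rtt_s + d_trans)         # (L/R) / (RTT + L/R)
    throughput_bps = packet_bits / (rtt_s + d_trans)  # one packet per RTT + L/R
    return d_trans, utilization, throughput_bps


# 8000-bit packet, 1 Gbps link, RTT = 2 * 15 ms propagation delay
d_trans, u_sender, throughput_bps = stop_and_wait(8000, 1e9, 0.030)
print(f"D_trans = {d_trans * 1e6:.1f} microsecs")
print(f"U_sender = {u_sender:.5f}")
print(f"throughput = {throughput_bps / 8 / 1000:.1f} kB/sec")
```

The throughput comes out at roughly 33 kB/sec on a 1 Gbps link, which is the point of the slide: the protocol, not the link, is the bottleneck.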

© Department of Networked Systems and Services 42


rdt3.0: stop-and-wait operation

sender receiver
first packet bit transmitted, t = 0
last packet bit transmitted, t = L / R

first packet bit arrives


RTT last packet bit arrives, send ACK

ACK arrives, send next


packet, t = RTT + L / R

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$

© Department of Networked Systems and Services 43


Pipelined protocols
pipelining: sender allows multiple, “in-flight”, yet-to-be-
acknowledged pkts
– range of sequence numbers must be increased
– buffering at sender and/or receiver

▪ two generic forms of pipelined protocols: go-Back-N, selective repeat
© Department of Networked Systems and Services 44
Pipelining: increased utilization
sender receiver
first packet bit transmitted, t = 0
last bit transmitted, t = L / R

first packet bit arrives


RTT last packet bit arrives, send ACK
last bit of 2nd packet arrives, send ACK
last bit of 3rd packet arrives, send ACK
ACK arrives, send next
packet, t = RTT + L / R
3-packet pipelining increases utilization by a factor of 3!

$U_{sender} = \frac{3L/R}{RTT + L/R} = \frac{.024}{30.008} = 0.0008$
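To see how utilization scales with the number of in-flight packets, the same formula can be evaluated for different window sizes. A minimal sketch, assuming the function name `pipelined_utilization` (not from the slides) and capping the result at 1.0 once the pipe is full:

```python
def pipelined_utilization(n_packets, packet_bits, rate_bps, rtt_s):
    """Sender utilization when n_packets are sent back-to-back per RTT + L/R."""
    d_trans = packet_bits / rate_bps
    return min(1.0, n_packets * d_trans / (rtt_s + d_trans))


u1 = pipelined_utilization(1, 8000, 1e9, 0.030)
u3 = pipelined_utilization(3, 8000, 1e9, 0.030)
print(f"1 packet:  {u1:.5f}")
print(f"3 packets: {u3:.5f}")
```

With these link parameters, a few thousand packets would have to be in flight before the sender is busy all the time.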

© Department of Networked Systems and Services 45


Pipelined protocols: overview

Go-back-N:
• sender can have up to N unacked packets in pipeline
• receiver only sends cumulative ack
  – doesn’t ack packet if there’s a gap
• sender has timer for oldest unacked packet
  – when timer expires, retransmit all unacked packets

Selective Repeat:
• sender can have up to N unack’ed packets in pipeline
• rcvr sends individual ack for each packet
• sender maintains timer for each unacked packet
  – when timer expires, retransmit only that unacked packet
© Department of Networked Systems and Services 46
Go-Back-N: sender

• k-bit seq # in pkt header


• “window” of up to N, consecutive unack’ed pkts allowed

▪ ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
• may receive duplicate ACKs (see receiver)
▪ timer for oldest in-flight pkt
▪ timeout(n): retransmit packet n and all higher seq # pkts in
window
© Department of Networked Systems and Services 47
GBN: sender extended FSM
initial (Λ):
  base=1
  nextseqnum=1

state "Wait":

▪ rdt_send(data)
  if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
          start_timer
      nextseqnum++
  }
  else
      refuse_data(data)

▪ timeout
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  …
  udt_send(sndpkt[nextseqnum-1])

▪ rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  Λ (no action)

▪ rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
      stop_timer
  else
      start_timer
© Department of Networked Systems and Services 48
GBN: receiver extended FSM
initial (Λ):
  expectedseqnum=1
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)

state "Wait":
▪ default
  – udt_send(sndpkt)
▪ rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
  – extract(rcvpkt,data)
  – deliver_data(data)
  – sndpkt = make_pkt(expectedseqnum,ACK,chksum)
  – udt_send(sndpkt)
  – expectedseqnum++

ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
– may generate duplicate ACKs
– need only remember expectedseqnum
• out-of-order pkt:
– discard (don’t buffer): no receiver buffering!
– re-ACK pkt with highest in-order seq #
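The receiver rules above fit in a few lines. This is an illustrative sketch rather than the slides' FSM verbatim: the class name `GbnReceiver` is an assumption, sequence numbers start at 0 (as in the "GBN in action" example) instead of 1, and `None` stands in for "nothing received in order yet".

```python
class GbnReceiver:
    """Go-Back-N receiver: no buffering, cumulative ACKs only."""

    def __init__(self):
        self.expected = 0     # the only state the receiver must remember
        self.delivered = []   # data handed to the upper layer, in order

    def receive(self, seq, data, corrupt=False):
        """Process one packet; return the seq # carried in the ACK sent back."""
        if not corrupt and seq == self.expected:
            self.delivered.append(data)
            self.expected += 1
        # Out-of-order or corrupt packets are discarded; in every case the
        # receiver (re)ACKs the highest in-order sequence number seen so far.
        return self.expected - 1 if self.expected > 0 else None


# pkt2 is lost on its first transmission, then pkt2..pkt5 are resent
rx = GbnReceiver()
arrivals = [0, 1, 3, 4, 5, 2, 3, 4, 5]
acks = [rx.receive(seq, f"data{seq}") for seq in arrivals]
```

The three duplicate `ack1` values in `acks` correspond to the discarded pkt3, pkt4 and pkt5.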
© Department of Networked Systems and Services 49
GBN in action
sender window (N=4) sender receiver
012345678 send pkt0
012345678 send pkt1
012345678 send pkt2 receive pkt0, send ack0
012345678 send pkt3 Xloss receive pkt1, send ack1
(wait)
receive pkt3, discard,
012345678 rcv ack0, send pkt4 (re)send ack1
012345678 rcv ack1, send pkt5 receive pkt4, discard,
(re)send ack1
ignore duplicate ACK receive pkt5, discard,
(re)send ack1
pkt 2 timeout
012345678 send pkt2
012345678 send pkt3
012345678 send pkt4 rcv pkt2, deliver, send ack2
012345678 send pkt5 rcv pkt3, deliver, send ack3
rcv pkt4, deliver, send ack4
rcv pkt5, deliver, send ack5

© Department of Networked Systems and Services 50


Selective repeat

• receiver individually acknowledges all correctly received pkts
– buffers pkts, as needed, for eventual in-order
delivery to upper layer
• sender only resends pkts for which ACK not
received
– sender timer for each unACKed pkt
• sender window
– N consecutive seq #’s
– limits seq #s of sent, unACKed pkts

© Department of Networked Systems and Services 51


Selective repeat: sender, receiver
windows

© Department of Networked Systems and Services 52


Selective repeat
sender
data from above:
▪ if next available seq # in window, send pkt
timeout(n):
▪ resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
▪ mark pkt n as received
▪ if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]
▪ send ACK(n)
▪ out-of-order: buffer
▪ in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
▪ ACK(n)
otherwise:
▪ ignore

© Department of Networked Systems and Services 53


Selective repeat in action
sender window (N=4) sender receiver
012345678 send pkt0
012345678 send pkt1
012345678 send pkt2 receive pkt0, send ack0
012345678 send pkt3 Xloss receive pkt1, send ack1
(wait)
receive pkt3, buffer,
012345678 rcv ack0, send pkt4 send ack3
012345678 rcv ack1, send pkt5 receive pkt4, buffer,
send ack4
record ack3 arrived receive pkt5, buffer,
send ack5
pkt 2 timeout
012345678 send pkt2
012345678 record ack4 arrived
012345678 rcv pkt2; deliver pkt2,
record ack5 arrived
012345678 pkt3, pkt4, pkt5; send ack2

Q: what happens when ack2 arrives?

© Department of Networked Systems and Services 54


Selective repeat: dilemma

example:
• seq #’s: 0, 1, 2, 3
• window size=3

▪ receiver sees no difference in two scenarios!
▪ duplicate data accepted as new in (b)

Q: what relationship between seq # size and window size to avoid problem in (b)?

[Figure: sender window (after receipt) and receiver window (after receipt) over seq #’s 0123012.
(a) no problem: pkt0, pkt1, pkt2 are received and ACKed; pkt3 is lost (X); the following pkt0 is new data, and receiver will accept packet with seq number 0.
(b) oops!: pkt0, pkt1, pkt2 are received but their ACKs are lost (X); timeout, sender retransmits pkt0; receiver will accept packet with seq number 0.
receiver can’t see sender side. receiver behavior identical in both cases! something’s (very) wrong!]
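One way to explore the question above is to compare the sequence numbers of the sender's old window with those of the receiver's next window. The sketch below is illustrative (the function name `ambiguous_seq_numbers` is an assumption); it relies on the standard result that selective repeat needs a window size of at most half the sequence-number space.

```python
def ambiguous_seq_numbers(seq_space, window):
    """Seq #s that could be either a retransmission or new data.

    Worst case: the receiver got all `window` packets and advanced its
    window, but every ACK was lost, so the sender still holds the old one.
    """
    old_window = {n % seq_space for n in range(0, window)}
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return sorted(old_window & new_window)


print(ambiguous_seq_numbers(seq_space=4, window=3))  # the slide's example
print(ambiguous_seq_numbers(seq_space=4, window=2))  # window = half the space
```

With 4 sequence numbers and a window of 3 the two windows overlap, so a retransmitted pkt0 is indistinguishable from new data; with a window of 2 the overlap disappears.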

© Department of Networked Systems and Services 55


Chapter 3 outline

3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 principles of congestion control
3.7 TCP congestion control
© Department of Networked Systems and Services 56
TCP: Overview
RFCs: 793, 1122, 1323, 2018, 2581

• point-to-point:
  – one sender, one receiver
• reliable, in-order byte stream:
  – no “message boundaries”
• pipelined:
  – TCP congestion and flow control set window size
▪ full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
▪ connection-oriented:
  • handshaking (exchange of control msgs) inits sender, receiver state before data exchange
▪ flow controlled
TCP segment structure

segment layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | Urg data pointer
  options (variable length)
  application data (variable length)

field notes:
▪ sequence number, acknowledgement number: counting by bytes of data (not segments!)
▪ URG: urgent data (generally not used)
▪ ACK: ACK # valid
▪ PSH: push data now (generally not used)
▪ RST, SYN, FIN: connection estab (setup, teardown commands)
▪ receive window: # bytes rcvr willing to accept
▪ checksum: Internet checksum (as in UDP)

© Department of Networked Systems and Services 58


TCP seq. numbers, ACKs
sequence numbers:
  – byte stream “number” of first byte in segment’s data
acknowledgements:
  – seq # of next byte expected from other side
  – cumulative ACK
Q: how receiver handles out-of-order segments
  – A: TCP spec doesn’t say, up to implementor

[Figure: outgoing segment from sender (source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer); sender sequence number space with window size N, divided into: sent ACKed | sent, not-yet ACKed (“in-flight”) | usable but not yet sent | not usable; incoming segment to sender carries acknowledgement number A]

© Department of Networked Systems and Services 59


TCP seq. numbers, ACKs

simple telnet scenario (Host A, Host B):
▪ User types ‘C’ at Host A: A sends Seq=42, ACK=79, data = ‘C’
▪ Host B ACKs receipt of ‘C’, echoes back ‘C’: B sends Seq=79, ACK=43, data = ‘C’
▪ Host A ACKs receipt of echoed ‘C’: A sends Seq=43, ACK=80

© Department of Networked Systems and Services 60


TCP round trip time, timeout

Q: how to set TCP timeout value?
▪ longer than RTT
  • but RTT varies
▪ too short: premature timeout, unnecessary retransmissions
▪ too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  – ignore retransmissions
• SampleRTT will vary, want estimated RTT “smoother”
  – average several recent measurements, not just current SampleRTT
© Department of Networked Systems and Services 61
TCP round trip time, timeout

EstimatedRTT = (1- )*EstimatedRTT + *SampleRTT


▪ exponential weighted moving average
▪ influence of past sample decreases exponentially fast
▪ typical value:  = 0.125 RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr. Axes: RTT (milliseconds) vs. time (seconds); curves: SampleRTT, EstimatedRTT]
TCP round trip time, timeout

• timeout interval: EstimatedRTT plus “safety margin”
  – large variation in EstimatedRTT -> larger safety margin
• estimate SampleRTT deviation from EstimatedRTT:

  DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
  (typically, β = 0.25)

  TimeoutInterval = EstimatedRTT + 4*DevRTT
  (EstimatedRTT: estimated RTT; 4*DevRTT: “safety margin”)
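The three formulas can be combined into a small estimator. A minimal sketch under stated assumptions: the class name `RttEstimator` is illustrative, the initial deviation is set to half the first sample (the initialisation RFC 6298 uses), and the deviation is updated before the estimate, using the old EstimatedRTT; the slides do not specify either detail.

```python
class RttEstimator:
    """EWMA round-trip-time estimator with a deviation-based safety margin."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        # Assumption: start the deviation at half the first sample.
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: DevRTT is computed against the previous EstimatedRTT.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
timeout = est.update(120.0)
```

Samples from retransmitted segments should not be fed to `update`, matching the "ignore retransmissions" rule for SampleRTT.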

* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/

© Department of Networked Systems and Services 63


Chapter 3 outline

3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 principles of congestion control
3.7 TCP congestion control
© Department of Networked Systems and Services 64
TCP reliable data transfer

• TCP creates rdt service on top of IP’s unreliable service
  – pipelined segments
  – cumulative acks
  – single retransmission timer
• retransmissions triggered by:
  – timeout events
  – duplicate acks

let’s initially consider simplified TCP sender:
  – ignore duplicate acks
  – ignore flow control, congestion control
© Department of Networked Systems and Services 65
TCP sender events:
data rcvd from app:
▪ create segment with seq #
▪ seq # is byte-stream number of first data byte in segment
▪ start timer if not already running
  • think of timer as for oldest unacked segment
  • expiration interval: TimeOutInterval

timeout:
▪ retransmit segment that caused timeout
▪ restart timer

ack rcvd:
▪ if ack acknowledges previously unacked segments
  • update what is known to be ACKed
  • start timer if there are still unacked segments

© Department of Networked Systems and Services 66


TCP sender (simplified)

initial (Λ):
  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum

state "wait for event":

▪ data received from application above
  create segment, seq. #: NextSeqNum
  pass segment to IP (i.e., “send”)
  NextSeqNum = NextSeqNum + length(data)
  if (timer currently not running)
      start timer

▪ timeout
  retransmit not-yet-acked segment with smallest seq. #
  start timer

▪ ACK received, with ACK field value y
  if (y > SendBase) {
      SendBase = y
      /* SendBase–1: last cumulatively ACKed byte */
      if (there are currently not-yet-acked segments)
          start timer
      else stop timer
  }
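The three event handlers of the simplified sender translate almost line for line into code. This is a sketch for illustration only: the class `SimpleTcpSender`, its `sent` log, and the boolean `timer_running` flag (standing in for a real countdown timer) are all assumptions, and flow control, congestion control, and duplicate ACKs are ignored as on the slide.

```python
class SimpleTcpSender:
    """Simplified TCP sender: cumulative ACKs, single retransmission timer."""

    def __init__(self, initial_seq=0):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = {}           # seq # of first byte -> segment data
        self.timer_running = False  # stand-in for a real timer
        self.sent = []              # log of (seq, length) passed to IP

    def _transmit(self, seq):
        self.sent.append((seq, len(self.unacked[seq])))

    def on_app_data(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        self._transmit(seq)
        self.next_seq += len(data)
        if not self.timer_running:
            self.timer_running = True

    def on_timeout(self):
        if self.unacked:
            self._transmit(min(self.unacked))   # smallest seq # only
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # drop every segment fully covered by the cumulative ACK
            for seq in [s for s in self.unacked
                        if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


# Two segments sent; ACK=100 is lost, but cumulative ACK=120 covers both.
sender = SimpleTcpSender(initial_seq=92)
sender.on_app_data(b"x" * 8)
sender.on_app_data(b"y" * 20)
sender.on_ack(120)
```

Because ACKs are cumulative, the arrival of ACK=120 alone is enough to advance `send_base` past both segments and stop the timer.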
© Department of Networked Systems and Services 67
TCP: retransmission scenarios
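lost ACK scenario (Host A, Host B):
▪ A sends Seq=92, 8 bytes of data
▪ B replies ACK=100; the ACK is lost (X)
▪ timeout at A: A resends Seq=92, 8 bytes of data
▪ B replies ACK=100; SendBase=100

premature timeout (Host A, Host B):
▪ SendBase=92; A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
▪ B replies ACK=100 and ACK=120
▪ timeout at A before the ACKs arrive: A resends Seq=92, 8 bytes of data
▪ ACK=100 arrives (SendBase=100), then ACK=120 arrives (SendBase=120)
▪ B replies ACK=120 to the retransmitted segment; SendBase=120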

© Department of Networked Systems and Services 68


TCP: retransmission scenarios
Host A Host B

Seq=92, 8 bytes of data

Seq=100, 20 bytes of data


timeout

ACK=100
X
ACK=120

Seq=120, 15 bytes of data

cumulative ACK

© Department of Networked Systems and Services 69


TCP ACK generation
[RFC 1122, RFC 2581]

event at receiver | TCP receiver action

arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed | delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

arrival of in-order segment with expected seq #. One other segment has ACK pending | immediately send single cumulative ACK, ACKing both in-order segments

arrival of out-of-order segment with higher-than-expected seq. #. Gap detected | immediately send duplicate ACK, indicating seq. # of next expected byte

arrival of segment that partially or completely fills gap | immediately send ACK, provided that segment starts at lower end of gap

© Department of Networked Systems and Services 70


TCP fast retransmit

▪ time-out period often relatively long:
  • long delay before resending lost packet
▪ detect lost segments via duplicate ACKs.
  • sender often sends many segments back-to-back
  • if segment is lost, there will likely be many duplicate ACKs.

TCP fast retransmit
if sender receives 3 ACKs for same data (“triple duplicate ACKs”), resend unacked segment with smallest seq #
▪ likely that unacked segment lost, so don’t wait for timeout
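Duplicate-ACK counting can be sketched as follows. The helper name `fast_retransmit_points` is hypothetical, and the sketch assumes the usual reading of "triple duplicate ACKs": three duplicates arriving after the original ACK for the same data, i.e. four identical ACKs in total.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ACK numbers at which a fast retransmit would be triggered."""
    last_ack = None
    dup_count = 0
    triggers = []
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                # resend the unacked segment that starts at byte `ack`
                triggers.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return triggers


# Segment starting at byte 100 is lost; later segments each trigger ACK=100.
print(fast_retransmit_points([100, 100, 100, 100]))
```

Counting only up to the threshold means further duplicates for the same ACK number do not trigger a second retransmission.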

© Department of Networked Systems and Services 71


TCP fast retransmit
Host A Host B

Seq=92, 8 bytes of data


Seq=100, 20 bytes of data
X

ACK=100
timeout

ACK=100
ACK=100
ACK=100
Seq=100, 20 bytes of data

fast retransmit after sender


receipt of triple duplicate ACK
© Department of Networked Systems and Services 72
Chapter 3 outline

3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 principles of congestion control
3.7 TCP congestion control
© Department of Networked Systems and Services 73
TCP flow control
flow control
receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast

▪ application may remove data from TCP socket buffers … slower than TCP receiver is delivering (sender is sending)

[Figure: receiver protocol stack: segments arrive from sender, pass through IP code and TCP code into the TCP socket receiver buffers in the OS, and are read by the application process]

© Department of Networked Systems and Services 74


TCP flow control

• receiver “advertises” free buffer space by including rwnd value in TCP header of receiver-to-sender segments
  – RcvBuffer size set via socket options (typical default is 4096 bytes)
  – many operating systems autoadjust RcvBuffer
• sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value
• guarantees receive buffer will not overflow

[Figure: receiver-side buffering: TCP segment payloads enter RcvBuffer, which holds buffered data and free buffer space (rwnd); data leaves to application process]

© Department of Networked Systems and Services 75


Chapter 3 outline

3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 principles of congestion control
3.7 TCP congestion control
© Department of Networked Systems and Services 76
Connection Management
before exchanging data, sender/receiver “handshake”:
• agree to establish connection (each knowing the other willing
to establish connection)
• agree on connection parameters

[Figure: client and server each hold the following between application and network]
  connection state: ESTAB
  connection variables:
    seq # client-to-server, server-to-client
    rcvBuffer size at server, client

client:  Socket clientSocket = new Socket("hostname", "port number");
server:  Socket connectionSocket = welcomeSocket.accept();

© Department of Networked Systems and Services 77


Agreeing to establish a
connection
2-way handshake:
Q: will 2-way
handshake always
Let’s talk work in network?
ESTAB
ESTAB
OK • variable delays
• retransmitted messages
(e.g. req_conn(x)) due to
message loss
choose x • message reordering
req_conn(x)
ESTAB
acc_conn(x) • can’t “see” other side
ESTAB

© Department of Networked Systems and Services 78


Agreeing to establish a
connection
2-way handshake failure scenarios:

choose x choose x
req_conn(x) req_conn(x)
ESTAB ESTAB
retransmit acc_conn(x) retransmit acc_conn(x)
req_conn(x) req_conn(x)

ESTAB ESTAB
data(x+1) accept
req_conn(x)
retransmit data(x+1)
data(x+1)
connection connection
client x completes server x completes server
client
terminates forgets x terminates forgets x
req_conn(x)

ESTAB ESTAB
data(x+1) accept
half open connection! data(x+1)
(no client!)
© Department of Networked Systems and Services 79
TCP 3-way handshake

client state: LISTEN, then SYNSENT, then ESTAB
server state: LISTEN, then SYN RCVD, then ESTAB

▪ client: choose init seq num, x; send TCP SYN msg
  – segment: SYNbit=1, Seq=x
  – client enters SYNSENT
▪ server: choose init seq num, y; send TCP SYNACK msg, acking SYN
  – segment: SYNbit=1, Seq=y, ACKbit=1; ACKnum=x+1
  – server enters SYN RCVD
▪ client: received SYNACK(x) indicates server is live; send ACK for SYNACK; this segment may contain client-to-server data
  – segment: ACKbit=1, ACKnum=y+1
  – client enters ESTAB
▪ server: received ACK(y) indicates client is live
  – server enters ESTAB
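The sequence and acknowledgement numbers exchanged in the handshake follow a simple pattern, sketched below. The function name `three_way_handshake` and the dictionary representation of a segment are illustrative assumptions; the only detail added beyond the slide is that TCP sequence numbers are 32-bit values, so `x+1` and `y+1` wrap around modulo 2^32.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit


def three_way_handshake(x, y):
    """Return the SYN, SYNACK and ACK segments for initial seq numbers x, y."""
    syn = {"SYN": 1, "seq": x}
    synack = {"SYN": 1, "ACK": 1, "seq": y, "acknum": (x + 1) % SEQ_SPACE}
    ack = {"ACK": 1, "acknum": (y + 1) % SEQ_SPACE}
    return [syn, synack, ack]


for segment in three_way_handshake(x=1000, y=5000):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends learn that the peer is live and has seen their SYN.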

© Department of Networked Systems and Services 80


TCP 3-way handshake:
FSM
closed

Socket connectionSocket =
welcomeSocket.accept();

L Socket clientSocket =
SYN(x) newSocket("hostname","port
number");
SYNACK(seq=y,ACKnum=x+1)
create new socket for SYN(seq=x)
communication back to client listen

SYN SYN
rcvd sent

SYNACK(seq=y,ACKnum=x+1)
ESTAB ACK(ACKnum=y+1)
ACK(ACKnum=y+1)
L

© Department of Networked Systems and Services 81


TCP: closing a connection

▪ client, server each close their side of connection
• send TCP segment with FIN bit = 1
▪ respond to received FIN with ACK
• on receiving FIN, ACK can be combined with own
FIN
▪ simultaneous FIN exchanges can be handled

© Department of Networked Systems and Services 82


TCP: closing a connection

client state server state


ESTAB ESTAB
clientSocket.close()
FIN_WAIT_1 can no longer FINbit=1, seq=x
send but can
receive data CLOSE_WAIT
ACKbit=1; ACKnum=x+1
can still
FIN_WAIT_2 wait for server send data
close

LAST_ACK
FINbit=1, seq=y
TIMED_WAIT can no longer
send data
ACKbit=1; ACKnum=y+1
timed wait
for 2*max CLOSED
segment lifetime

CLOSED

© Department of Networked Systems and Services 83


Chapter 3 outline

3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 principles of congestion control
3.7 TCP congestion control
© Department of Networked Systems and Services 84
Principles of congestion
control

congestion:
• informally: “too many sources sending too
much data too fast for network to handle”
• different from flow control!
• manifestations:
– lost packets (buffer overflow at routers)
– long delays (queueing in router buffers)
• a top-10 problem!

© Department of Networked Systems and Services 85


Causes/costs of congestion:
scenario 1
▪ two senders, two receivers
▪ one router, infinite buffers
▪ output link capacity: R
▪ no retransmission

[Figure: Host A and Host B send original data at rate λin into a router with unlimited shared output link buffers; throughput: λout]
[Plots: λout vs. λin (both up to R/2); delay vs. λin (up to R/2)]

▪ maximum per-connection throughput: R/2
▪ large delays as arrival rate, λin, approaches capacity
© Department of Networked Systems and Services 86
Causes/costs of congestion:
scenario 2
▪ one router, finite buffers
▪ sender retransmission of timed-out packet
  • application-layer input = application-layer output: λin = λout
  • transport-layer input includes retransmissions: λ'in ≥ λin

[Figure: Host A sends λin (original data) and λ'in (original data, plus retransmitted data) into finite shared output link buffers toward Host B; output λout]
© Department of Networked Systems and Services 87
Causes/costs of congestion:
scenario 2
idealization: perfect knowledge
▪ sender sends only when router buffers available

[Plot: λout vs. λin, both up to R/2]
[Figure: A keeps a copy of each packet and sends only when there is free buffer space; λin: original data; λ'in: original data, plus retransmitted data; finite shared output link buffers; Host B; output λout]
© Department of Networked Systems and Services 88
Causes/costs of congestion:
scenario 2
Idealization: known loss
▪ packets can be lost, dropped at router due to full buffers
▪ sender only resends if packet known to be lost

[Figure: A keeps a copy of each packet; a packet arriving when there is no buffer space is dropped; λin: original data; λ'in: original data, plus retransmitted data; Host B; output λout]

© Department of Networked Systems and Services 89


Causes/costs of congestion:
scenario 2
Idealization: known loss
▪ packets can be lost, dropped at router due to full buffers
▪ sender only resends if packet known to be lost

[Plot: λout vs. λin, both up to R/2. when sending at R/2, some packets are retransmissions but asymptotic goodput is still R/2 (why?)]
[Figure: A resends the packet when there is free buffer space; λin: original data; λ'in: original data, plus retransmitted data; Host B; output λout]

© Department of Networked Systems and Services 90


Causes/costs of congestion:
scenario 2
Realistic: duplicates
▪ packets can be lost, dropped at router due to full buffers
▪ sender times out prematurely, sending two copies, both of which are delivered

[Plot: λout vs. λin, both up to R/2. when sending at R/2, some packets are retransmissions including duplicated that are delivered!]
[Figure: A times out and sends a copy while there is free buffer space; λin, λ'in; Host B; output λout]

© Department of Networked Systems and Services 91


Causes/costs of congestion:
scenario 2
Realistic: duplicates
▪ packets can be lost, dropped at router due to full buffers
▪ sender times out prematurely, sending two copies, both of which are delivered

[Plot: λout vs. λin, both up to R/2. when sending at R/2, some packets are retransmissions including duplicated that are delivered!]

“costs” of congestion:
▪ more work (retrans) for given “goodput”
▪ unneeded retransmissions: link carries multiple copies of pkt
  • decreasing goodput

© Department of Networked Systems and Services 92


Causes/costs of congestion:
scenario 3
▪ four senders
▪ multihop paths
▪ timeout/retransmit

Q: what happens as λin and λ'in increase?
A: as red λ'in increases, all arriving blue pkts at upper queue are dropped, blue throughput → 0

[Figure: Host A, Host B, Host C, Host D connected over multihop paths through routers with finite shared output link buffers; λin: original data; λ'in: original data, plus retransmitted data; output λout]

© Department of Networked Systems and Services 93


Causes/costs of congestion:
scenario 3
[Plot: λout vs. λ'in, both up to C/2]

another “cost” of congestion:
▪ when packet dropped, any “upstream” transmission capacity used for that packet was wasted!

© Department of Networked Systems and Services 94


Chapter 3 outline

3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 principles of congestion control
3.7 TCP congestion control
© Department of Networked Systems and Services 95
TCP congestion control: additive
increase multiplicative decrease

▪ approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
  • additive increase: increase cwnd by 1 MSS every RTT until loss detected
  • multiplicative decrease: cut cwnd in half after loss

[Figure: AIMD saw tooth behavior: probing for bandwidth. Axes: cwnd (TCP sender congestion window size) vs. time; additively increase window size … until loss occurs (then cut window in half)]
© Department of Networked Systems and Services 96
TCP Congestion Control: details

▪ sender limits transmission:
  LastByteSent - LastByteAcked ≤ cwnd
▪ cwnd is dynamic, function of perceived network congestion

TCP sending rate:
▪ roughly: send cwnd bytes, wait RTT for ACKS, then send more bytes

  $\text{rate} \approx \frac{cwnd}{RTT}\ \text{bytes/sec}$

[Figure: sender sequence number space: last byte ACKed | sent, not-yet ACKed (“in-flight”, cwnd bytes) | last byte sent]

© Department of Networked Systems and Services 97


TCP Slow Start
Host A Host B
▪ when connection begins, increase rate exponentially until first loss event:
  • initially cwnd = 1 MSS
  • double cwnd every RTT
  • done by incrementing cwnd for every ACK received
▪ summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline with RTT marked on the time axis]
© Department of Networked Systems and Services 98
TCP: detecting, reacting to loss

▪ loss indicated by timeout:
  • cwnd set to 1 MSS;
  • window then grows exponentially (as in slow start) to threshold, then grows linearly
▪ loss indicated by 3 duplicate ACKs: TCP RENO
  • dup ACKs indicate network capable of delivering some segments
  • cwnd is cut in half, window then grows linearly
▪ TCP Tahoe always sets cwnd to 1 MSS (timeout or 3 duplicate acks)
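The reactions to loss can be traced round by round with a coarse model. This is a simplified sketch, not real TCP: the function `cwnd_trace`, the event names `"rtt"`, `"timeout"` and `"dup3"`, and the per-RTT granularity are assumptions; cwnd is counted in MSS, the threshold is set to half of cwnd at each loss, and Reno's fast-recovery window inflation is left out.

```python
def cwnd_trace(events, ssthresh=8, variant="reno"):
    """Return cwnd (in MSS) after each event, starting from cwnd = 1."""
    cwnd = 1
    trace = [cwnd]
    for event in events:
        if event == "rtt":                      # one loss-free round trip
            if cwnd < ssthresh:
                cwnd = min(cwnd * 2, ssthresh)  # slow start: exponential
            else:
                cwnd += 1                       # congestion avoidance: linear
        elif event == "timeout" or (event == "dup3" and variant == "tahoe"):
            ssthresh = max(cwnd // 2, 1)
            cwnd = 1                            # restart from slow start
        elif event == "dup3":                   # Reno, 3 duplicate ACKs
            ssthresh = max(cwnd // 2, 1)
            cwnd = ssthresh                     # cut in half, then linear
        trace.append(cwnd)
    return trace


events = ["rtt"] * 4 + ["dup3"] + ["rtt"] * 2
reno = cwnd_trace(events, variant="reno")
tahoe = cwnd_trace(events, variant="tahoe")
```

Comparing the two traces shows the difference on three duplicate ACKs: Reno resumes linear growth from the halved window, while Tahoe drops to 1 MSS and slow-starts again.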
© Department of Networked Systems and Services 99
TCP: switching from slow
start to CA
Q: when should the
exponential increase
switch to linear?
A: when cwnd gets to
1/2 of its value
before timeout.

Implementation:
▪ variable ssthresh
▪ on loss event, ssthresh
is set to 1/2 of cwnd just
before loss event
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/

© Department of Networked Systems and Services 100


TCP throughput

• avg. TCP thruput as function of window size, RTT?
  – ignore slow start, assume always data to send
• W: window size (measured in bytes) where loss occurs
  – avg. window size (# in-flight bytes) is ¾ W
  – avg. thruput is 3/4W per RTT

  $\text{avg TCP thruput} = \frac{3}{4}\,\frac{W}{RTT}\ \text{bytes/sec}$

[Figure: window size sawtooth over time, with W/2 marked]
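The ¾ W figure is the mean of a window that ramps linearly from W/2 up to W between losses, which a quick numeric check confirms. The function names below are illustrative, and the example values of W and RTT are arbitrary.

```python
def avg_tcp_throughput(w_bytes, rtt_s):
    """Average throughput in bytes/sec for a sawtooth between W/2 and W."""
    return 0.75 * w_bytes / rtt_s


def sawtooth_average(w, steps=1000):
    """Numerically average a window that grows linearly from W/2 to W."""
    samples = [w / 2 + (w / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


print(sawtooth_average(100))              # average window for W = 100
print(avg_tcp_throughput(100_000, 0.1))   # W = 100 kB, RTT = 100 ms
```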

© Department of Networked Systems and Services 101


TCP Fairness

fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]

© Department of Networked Systems and Services 102


Why is TCP fair?

two competing sessions:
▪ additive increase gives slope of 1, as throughput increases
▪ multiplicative decrease decreases throughput proportionally

[Figure: throughput plot with horizontal axis “Connection 1 throughput” (up to R), vertical axis up to R, and the “equal bandwidth share” line; the trajectory alternates between “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”]
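The convergence toward the equal-share line can be checked with a toy model of two AIMD flows. This is a sketch under simplifying assumptions: both flows have the same RTT, increase by one unit per round, and both detect loss in the same round whenever their combined rate exceeds the capacity. The function name `aimd_two_flows` is illustrative.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2     # loss: multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1     # no loss: additive increase
    return x1, x2


# Start far from the fair share: flow 2 has 40 units more than flow 1.
x1, x2 = aimd_two_flows(1.0, 41.0, capacity=50.0, rounds=200)
print(x1, x2)
```

Additive increase leaves the gap between the flows unchanged, while each halving cuts it in two, so the gap shrinks geometrically and the rates approach an equal share.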

© Department of Networked Systems and Services 103


Fairness (more)

Fairness and UDP
▪ multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
▪ instead use UDP:
  • send audio/video at constant rate, tolerate packet loss

Fairness, parallel TCP connections
▪ application can open multiple parallel connections between two hosts
▪ web browsers do this
▪ e.g., link of rate R with 9 existing connections:
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of 20 connections)
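The parallel-connection example is plain proportional sharing, assuming every TCP connection on the bottleneck gets an equal share. The helper name `share_of_link` is hypothetical.

```python
from fractions import Fraction


def share_of_link(new_connections, existing_connections):
    """Fraction of link rate R obtained by an app opening new_connections."""
    total = new_connections + existing_connections
    return Fraction(new_connections, total)


print(share_of_link(1, 9))    # one new connection among 9 existing ones
print(share_of_link(11, 9))   # eleven new connections among 9 existing ones
```

Opening 11 connections yields 11/20 of R, slightly more than the R/2 quoted on the slide.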
© Department of Networked Systems and Services 104
Explicit Congestion
Notification (ECN)
network-assisted congestion control:
▪ two bits in IP header (ToS field) marked by network router to indicate congestion
▪ congestion indication carried to receiving host
▪ receiver (seeing congestion indication in IP datagram) sets ECE bit on receiver-to-sender ACK segment to notify sender of congestion

[Figure: source and destination protocol stacks (application, transport, network, link, physical); the IP datagram leaves the source with ECN=00 and arrives marked ECN=11; the destination returns a TCP ACK segment with ECE=1]
© Department of Networked Systems and Services 105
Chapter 3: summary
▪ principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
▪ instantiation, implementation in the Internet
  • UDP
  • TCP

next:
• leaving the network “edge” (application, transport layers)
• into the network “core”
• two network layer chapters:
  – data plane
  – control plane

© Department of Networked Systems and Services 106


www.hit.bme.hu

BMEVIHIAB01-COMNET

Adrian Pekar
[email protected]

© Department of Networked Systems and Services 1


https://fiberbit.com.tw/tcpip-model-vs-osi-model/

https://networkencyclopedia.com/encapsulation/

https://encyclopedia2.thefreedictionary.com/Network+stack

https://parkerhill.me/optimizing/


Chapter 4
Network Layer: The Data Plane

A note on the use of these Powerpoint slides:
We’re making these slides freely available to all (faculty, students, readers). They’re in PowerPoint form so you see the animations; and can add, modify, and delete slides (including this one) and slide content to suit your needs. They obviously represent a lot of work on our part. In return for use, we only ask the following:
§ If you use these slides (e.g., in a class) that you mention their source (after all, we’d like people to use our book!)
§ If you post any slides on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material.
Thanks and enjoy! JFK/KWR

Computer Networking: A Top Down Approach
7th edition
Jim Kurose, Keith Ross
Pearson/Addison Wesley
April 2016

All material copyright 1996-2016
J.F Kurose and K.W. Ross, All Rights Reserved


Network layer overview


Network layer
• transport segment from sending to receiving host
• on sending side encapsulates segments into datagrams
• on receiving side, delivers segments to transport layer
• network layer protocols in every host, router
• router examines header fields in all IP datagrams passing through it

[Figure: end hosts run the full stack (application, transport, network, data link, physical); routers along the path run network, data link, physical.]

[Figure. Source: https://ccnacompletecourse.blogspot.com/2019/08/data-encapsulation-in-computer-networks.html]


Two key network-layer functions

network-layer functions:
• forwarding: move packets from router’s input to appropriate router output
• routing: determine route taken by packets from source to destination
– routing algorithms

analogy: taking a trip
§ forwarding: process of getting through single interchange
§ routing: process of planning trip from source to destination


Network layer: data plane, control plane

Data plane
§ local, per-router function
§ determines how datagram arriving on router input port is forwarded to router output port
§ forwarding function

Control plane
§ network-wide logic
§ determines how datagram is routed among routers along end-end path from source host to destination host
§ two control-plane approaches:
• traditional routing algorithms: implemented in routers
• software-defined networking (SDN): implemented in (remote) servers

[Figure: values in arriving packet header (0111) select one of router output ports 1, 2, 3.]


Per-router control plane

Individual routing algorithm components in each and every router interact in the control plane

[Figure 4.2: Routing algorithms determine values in forwarding tables. The routing algorithm in the control plane fills the local forwarding table in the data plane; values in the arriving packet's header (e.g., 0111) are looked up to select output port 1, 2, or 3.]

Local forwarding table
header   output
0100     3
0110     2
0111     2
1001     1

Logically centralized control plane
A distinct (typically remote) controller interacts with local control agents (CAs)

[Figure: a remote controller in the control plane communicates with a control agent (CA) in each router; in the data plane, values in arriving packet header (0111) select output port 1, 2, or 3.]


TRADITIONAL NETWORKING VS SDN

[Figure. Source: https://www.sdxcentral.com/5g/definitions/5g-sdn/]


What is inside a router?


Router architecture overview
• high-level view of generic router architecture:
– routing processor: routing, management control plane (software), operates in millisecond time frame
– forwarding data plane (hardware): operates in nanosecond timeframe
– router input ports, high-speed switching fabric, router output ports


Input port functions

line termination → link layer protocol (receive) → lookup, forwarding, queueing → switch fabric

physical layer (line termination): bit-level reception
data link layer (link layer protocol): e.g., Ethernet
decentralized switching:
• using header field values, lookup output port using forwarding table in input port memory (“match plus action”)
• goal: complete input port processing at ‘line speed’
• queuing: occurs if datagrams arrive faster than forwarding rate into switch fabric

Destination-based forwarding
destination-based forwarding: forward based only on destination IP address (traditional)

forwarding table
Destination Address Range                        Link Interface
11001000 00010111 00010000 00000000
  through 11001000 00010111 00010111 11111111    0
11001000 00010111 00011000 00000000
  through 11001000 00010111 00011000 11111111    1
11001000 00010111 00011001 00000000
  through 11001000 00010111 00011111 11111111    2
otherwise                                        3

Q: but what happens if ranges don’t divide up so nicely?

Longest prefix matching

longest prefix matching: when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.

Destination Address Range                 Link interface
11001000 00010111 00010*** ********       0
11001000 00010111 00011000 ********       1
11001000 00010111 00011*** ********       2
otherwise                                 3

examples:
DA: 11001000 00010111 00010110 10100001   which interface?
DA: 11001000 00010111 00011000 10101010   which interface?
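The two example lookups can be worked through with a small sketch. It stores each table row as its fixed bit prefix (wildcard bits dropped) and does a linear scan; the names `FORWARDING_TABLE` and `lookup` are illustrative, and real routers use specialised hardware or tries rather than a scan.

```python
# Forwarding table from the slide: (fixed bit prefix, link interface).
FORWARDING_TABLE = [
    ("110010000001011100010", 0),     # 11001000 00010111 00010***
    ("110010000001011100011000", 1),  # 11001000 00010111 00011000
    ("110010000001011100011", 2),     # 11001000 00010111 00011***
]
DEFAULT_INTERFACE = 3  # the "otherwise" row

def lookup(dest_bits: str) -> int:
    """Return the link interface of the longest matching prefix."""
    dest = dest_bits.replace(" ", "")
    best_len, best_interface = -1, DEFAULT_INTERFACE
    for prefix, interface in FORWARDING_TABLE:
        if dest.startswith(prefix) and len(prefix) > best_len:
            best_len, best_interface = len(prefix), interface
    return best_interface

print(lookup("11001000 00010111 00010110 10100001"))  # 0
print(lookup("11001000 00010111 00011000 10101010"))  # 1 (24-bit row beats 21-bit row)
```

The second address matches both the 24-bit and the 21-bit row; the longer prefix wins, so it leaves on interface 1.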

Actions Summary

Although “lookup” is arguably the most important action in input port processing, many other actions must be taken:
1) physical- and link-layer processing must occur, as discussed previously;
2) the packet’s version number, checksum and time-to-live field must be checked, and the latter two fields rewritten; and
3) counters used for network management (such as the number of IP datagrams received) must be updated.

Switching fabrics
§ transfer packet from input buffer to appropriate output buffer
§ switching rate: rate at which packets can be transferred from inputs to outputs
• often measured as multiple of input/output line rate
• N inputs: switching rate N times line rate desirable
§ three types of switching fabrics: memory, bus, crossbar


OUTPUT PORTS

switch fabric → datagram buffer, queueing → link layer protocol (send) → line termination

decentralized switching:
• goal: complete output port processing at ‘line speed’
• queuing: when arrival at output port exceeds output line speed
data link layer (link layer protocol): e.g., Ethernet
physical layer (line termination): bit-level transmission


Queuing


Input port queuing
• fabric slower than input ports combined -> queueing may occur at input queues
– queueing delay and loss due to input buffer overflow!
• Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward

[Figure, left: output port contention: only one red datagram can be transferred; lower red packet is blocked. Right: one packet time later: green packet experiences HOL blocking.]


Output port queueing
• buffering when arrival rate via switch exceeds output line speed
• queueing (delay) and loss due to output port buffer overflow!

[Figure, left: at t, packets move from input to output. Right: one packet time later.]


Scheduling mechanisms
• scheduling: choose next packet to send on link
• FIFO (first in first out)/FCFS (first come first served) scheduling: send in order of arrival to queue
– discard policy: if packet arrives to full queue: who to discard?
• tail drop: drop arriving packet
• priority: drop/remove on priority basis
• random: drop/remove randomly

[Figure: packet arrivals → queue (waiting area) → link (server) → packet departures.]
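FIFO scheduling with the tail-drop discard policy can be sketched as a bounded queue. This is a minimal illustration, not router code; the class name `FifoTailDropQueue` and the packet labels are assumptions.

```python
from collections import deque

class FifoTailDropQueue:
    """Bounded FIFO queue: packets leave in arrival order,
    and a packet arriving to a full queue is dropped (tail drop)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buffer = deque()
        self.dropped = []

    def enqueue(self, packet) -> bool:
        if len(self.buffer) >= self.capacity:
            self.dropped.append(packet)  # tail drop: discard the arriving packet
            return False
        self.buffer.append(packet)
        return True

    def dequeue(self):
        """Return the next packet to send on the link, or None if idle."""
        return self.buffer.popleft() if self.buffer else None

queue = FifoTailDropQueue(capacity=2)
results = [queue.enqueue(p) for p in ("p1", "p2", "p3")]
print(results)          # [True, True, False]
print(queue.dequeue())  # p1
```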


Scheduling policies: priority
priority scheduling:
• multiple classes, with different priorities
• send highest priority queued packet
• class may depend on marking or other header info, e.g., IP src/dst, port numbers, etc.

[Figure: arrivals → classify → high priority queue (waiting area) / low priority queue (waiting area) → link (server) → departures.]


Priority queue in operation

Under a non-preemptive priority queuing discipline, the transmission of a packet is not interrupted once it has begun.
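A priority scheduler can be sketched as one FIFO queue per class, always serving the highest-priority non-empty queue. The sketch assumes class 0 is the highest priority; the name `PriorityScheduler` is illustrative. It is non-preemptive in the sense that a packet, once returned by `next_packet`, is considered fully transmitted before the next choice is made.

```python
from collections import deque

class PriorityScheduler:
    """One FIFO queue per class; class 0 is the highest priority."""

    def __init__(self, num_classes: int = 2) -> None:
        self.queues = [deque() for _ in range(num_classes)]

    def enqueue(self, packet, priority_class: int) -> None:
        self.queues[priority_class].append(packet)

    def next_packet(self):
        """Pick the head of the highest-priority non-empty queue."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing queued: link stays idle

scheduler = PriorityScheduler()
scheduler.enqueue("low-1", 1)
scheduler.enqueue("high-1", 0)
scheduler.enqueue("low-2", 1)
order = [scheduler.next_packet() for _ in range(3)]
print(order)  # ['high-1', 'low-1', 'low-2']
```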


Two-class round robin queue in operation
• cyclically scan class queues, sending one complete packet from each class (if available)
• Work-conserving queuing – will never allow the link to remain idle whenever there are packets (of any class) queued for transmission.
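A work-conserving round robin scheduler can be sketched in the same style: it cycles through the class queues, but skips an empty class instead of leaving the link idle. The name `RoundRobinScheduler` and the packet labels are assumptions made for illustration.

```python
from collections import deque

class RoundRobinScheduler:
    """Cyclically serve class queues, one complete packet per visit.
    Work-conserving: an empty class is skipped immediately."""

    def __init__(self, num_classes: int = 2) -> None:
        self.queues = [deque() for _ in range(num_classes)]
        self.next_class = 0  # class to try first on the next call

    def enqueue(self, packet, traffic_class: int) -> None:
        self.queues[traffic_class].append(packet)

    def next_packet(self):
        count = len(self.queues)
        for step in range(count):
            cls = (self.next_class + step) % count
            if self.queues[cls]:
                self.next_class = (cls + 1) % count
                return self.queues[cls].popleft()
        return None  # every queue is empty

rr = RoundRobinScheduler()
for packet in ("a1", "a2", "a3"):
    rr.enqueue(packet, 0)
rr.enqueue("b1", 1)
sent = [rr.next_packet() for _ in range(4)]
print(sent)  # ['a1', 'b1', 'a2', 'a3']
```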

Internet Protocol


The Internet network layer
host, router network layer functions:

transport layer: TCP, UDP

network layer:
routing protocols
• path selection
• RIP, OSPF, BGP
forwarding table
IP protocol
• addressing conventions
• datagram format
• packet handling conventions
ICMP protocol
• error reporting
• router “signaling”

link layer

physical layer


IP datagram format

Header layout (each row 32 bits wide):
ver | head. len | type of service | length
16-bit identifier | flgs | fragment offset
time to live | upper layer | header checksum
32 bit source IP address
32 bit destination IP address
options (if any)
data (variable length, typically a TCP or UDP segment)

Field notes:
• ver: IP protocol version number
• head. len: header length (bytes)
• type of service: “type” of data
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
• time to live: max number remaining hops (decremented at each router)
• upper layer: upper layer protocol to deliver payload to
• options: e.g. timestamp, record route taken, specify list of routers to visit

how much overhead?
• 20 bytes of TCP
• 20 bytes of IP
• = 40 bytes + app layer overhead
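The fixed 20-byte header can be unpacked with Python's `struct` module. This is a sketch for a header without options; the sample values (identifier 1, TTL 64, checksum 0 as a placeholder rather than a computed checksum) and the helper name `parse_ipv4_header` are assumptions. Note that on the wire the header-length field counts 32-bit words, which the sketch converts to bytes.

```python
import socket
import struct

# Network byte order: ver/len, ToS, total length, identifier,
# flags + fragment offset, TTL, upper layer, checksum, src, dst = 20 bytes.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")

def parse_ipv4_header(raw: bytes) -> dict:
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = IPV4_HEADER.unpack(raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len_bytes": (ver_ihl & 0x0F) * 4,  # field counts 32-bit words
        "type_of_service": tos,
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,
        "fragment_offset": flags_frag & 0x1FFF,
        "time_to_live": ttl,
        "upper_layer": proto,                      # 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

# Hypothetical header: 20 bytes of IP carrying a 20-byte TCP header, no data.
sample = IPV4_HEADER.pack(0x45, 0, 40, 1, 0, 64, 6, 0,
                          socket.inet_aton("223.1.1.1"),
                          socket.inet_aton("223.1.2.2"))
fields = parse_ipv4_header(sample)
print(fields["version"], fields["header_len_bytes"], fields["total_length"])  # 4 20 40
```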


IP fragmentation


IP fragmentation, reassembly
• network links have MTU (maximum transmission unit) - largest possible link-level frame
– different link types, different MTUs
• large IP datagram divided (“fragmented”) within net
– one datagram becomes several datagrams
– “reassembled” only at final destination
– IP header bits used to identify, order related fragments

[Figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the final destination.]

IP fragmentation, reassembly
example:
• 4000 byte datagram
• MTU = 1500 bytes
one large datagram becomes several smaller datagrams

             length   ID   fragflag   offset
original     4000     x    0          0
fragment 1   1500     x    1          0
fragment 2   1500     x    1          185
fragment 3   1040     x    0          370

• 1480 bytes in data field of each full-size fragment
• offset of fragment 2 = 1480/8 = 185
• offset of fragment 3 = 2960/8 = 370
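The numbers in this example can be reproduced with a short sketch. It assumes a 20-byte IP header without options and expresses offsets in 8-byte units, as the fragment offset field does; the function name `fragment` is illustrative.

```python
def fragment(total_length: int, mtu: int, header_len: int = 20) -> list:
    """Split a datagram of `total_length` bytes into fragments that fit
    the MTU. Each fragment carries its own `header_len`-byte header."""
    payload = total_length - header_len
    max_data = (mtu - header_len) // 8 * 8  # data per fragment, multiple of 8
    fragments = []
    offset = 0
    while offset < payload:
        data = min(max_data, payload - offset)
        more_fragments = 1 if offset + data < payload else 0
        fragments.append({
            "length": data + header_len,
            "fragflag": more_fragments,
            "offset": offset // 8,
        })
        offset += data
    return fragments

for frag in fragment(4000, 1500):
    print(frag)
# {'length': 1500, 'fragflag': 1, 'offset': 0}
# {'length': 1500, 'fragflag': 1, 'offset': 185}
# {'length': 1040, 'fragflag': 0, 'offset': 370}
```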


IP addressing


Classful Addressing
• Addresses were assigned on 8-bit boundaries.
• So, we ended up with a few very large blocks (class A addresses), some mid-sized blocks (class B addresses) and many small blocks (class C).
• In addition, there was a range of addresses (class E) that was pretty much unusable because it was reserved for research and development; many operators today filter packets sent from those addresses.

Source: https://labs.ripe.net/Members/olafur_gudmundsson/what-do-we-know-about-an-ip-address


IP addressing: introduction
• IP address: 32-bit identifier for host, router interface
• interface: connection between host/router and physical link
– routers typically have multiple interfaces
– host typically has one or two interfaces (e.g., wired Ethernet, wireless 802.11)
• IP addresses associated with each interface

223.1.1.1 = 11011111 00000001 00000001 00000001
               223       1        1        1

[Figure: example network with interface addresses 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27.]


Binary IPv4 to dotted decimal

[Figure. Source: http://www.highteck.net/EN/Network/Addressing_the_Network-IPv4.html]
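The conversion between dotted decimal and binary notation can be sketched in a few lines. The helper names `dotted_to_binary` and `binary_to_dotted` are illustrative; each of the four octets is simply converted on its own.

```python
def dotted_to_binary(address: str) -> str:
    """'223.1.1.1' -> '11011111 00000001 00000001 00000001'"""
    return " ".join(f"{int(octet):08b}" for octet in address.split("."))

def binary_to_dotted(bits: str) -> str:
    """'11011111 00000001 00000001 00000001' -> '223.1.1.1'"""
    return ".".join(str(int(octet, 2)) for octet in bits.split())

print(dotted_to_binary("223.1.1.1"))
# 11011111 00000001 00000001 00000001
print(binary_to_dotted("11001000 00010111 00010000 00000000"))
# 200.23.16.0
```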


IP addressing: introduction
Q: how are interfaces actually connected?
A: wired Ethernet interfaces connected by Ethernet switches
A: wireless WiFi interfaces connected by WiFi base station
For now: don’t need to worry about how one interface is connected to another (with no intervening router)

[Figure: the example network with interface addresses 223.1.1.x, 223.1.2.x, 223.1.3.x.]


Subnets
• what’s a subnet?
– device interfaces with same subnet part of IP address
– can physically reach each other without intervening router

[Figure: network consisting of 3 subnets (interfaces 223.1.1.x, 223.1.2.x, 223.1.3.x).]


Subnets
recipe
§ to determine the subnets, detach each interface from its host or router, creating islands of isolated networks
§ each isolated network is called a subnet

[Figure: the three subnets 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24; subnet mask: /24.]


Subnets
how many?
6

[Figure: network with interface addresses 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.6, 223.1.3.1, 223.1.3.2, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1, 223.1.9.2.]


IPv4 subnet mask
• An IP address is a hierarchical address that is made up of two parts: a network portion and a host portion.
• The bits within the network portion of the address must be identical for all devices that reside in the same network.
• The subnet mask is compared to the IP address from left to right, bit for bit.
• The subnet mask is represented in dotted decimal format for ease of use.
• The subnet mask is configured on a host device, in conjunction with the IPv4 address, and is required so the host can determine which network it belongs to.

Source: http://sclabs.blogspot.com/
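The bit-for-bit comparison amounts to a bitwise AND of address and mask. The sketch below assumes the /24 mask 255.255.255.0 used in the earlier subnet slides; the helper names are illustrative.

```python
def to_int(address: str) -> int:
    """Pack a dotted-decimal IPv4 address into a 32-bit integer."""
    a, b, c, d = (int(octet) for octet in address.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def to_dotted(value: int) -> str:
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

def network_address(address: str, mask: str) -> str:
    """Keep the network portion: address AND mask."""
    return to_dotted(to_int(address) & to_int(mask))

def same_network(addr_a: str, addr_b: str, mask: str) -> bool:
    return network_address(addr_a, mask) == network_address(addr_b, mask)

print(network_address("223.1.1.4", "255.255.255.0"))            # 223.1.1.0
print(same_network("223.1.1.4", "223.1.1.2", "255.255.255.0"))  # True
print(same_network("223.1.1.4", "223.1.2.9", "255.255.255.0"))  # False
```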

Classful vs Classless Addressing


IP addressing: CIDR
CIDR: Classless InterDomain Routing
• subnet portion of address of arbitrary length
• address format: a.b.c.d/x, where x is # bits in subnet portion of address

subnet part (first 23 bits)      host part (last 9 bits)
11001000 00010111 0001000        0 00000000
200.23.16.0/23
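Python's standard `ipaddress` module understands the a.b.c.d/x notation, so the 200.23.16.0/23 example can be inspected directly. The two test addresses are hypothetical hosts chosen for illustration.

```python
import ipaddress

net = ipaddress.ip_network("200.23.16.0/23")

print(net.prefixlen)      # 23 bits of subnet part
print(net.netmask)        # 255.255.254.0
print(net.num_addresses)  # 512, i.e. 2 ** (32 - 23)

# Membership test: does an address share the 23-bit subnet part?
inside = ipaddress.ip_address("200.23.17.42") in net
outside = ipaddress.ip_address("200.23.18.1") in net
print(inside, outside)    # True False
```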


Assigning Addresses

[Figure. Source: http://www.highteck.net/EN/Network/Addressing_the_Network-IPv4.html]


How to get an IP address?


IP addresses: how to get one?
Q: How does a host get IP address?
• hard-coded by system admin in a file
• DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
– “plug-and-play”


DHCP: Dynamic Host Configuration Protocol
goal: allow host to dynamically obtain its IP address from network server when it joins network
– can renew its lease on address in use
– allows reuse of addresses (only hold address while connected/“on”)
– support for mobile users who want to join network


DHCP client-server scenario

[Figure: network with subnets 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24 and a DHCP server; an arriving DHCP client needs an address in this network.]


DHCP client-server scenario
DHCP server: 223.1.2.5; arriving client

DHCP discover (client → server)
Broadcast: is there a DHCP server out there?
src: 0.0.0.0, 68
dest: 255.255.255.255, 67
yiaddr: 0.0.0.0
transaction ID: 654

DHCP offer (server → client)
Broadcast: I’m a DHCP server! Here’s an IP address you can use
src: 223.1.2.5, 67
dest: 255.255.255.255, 68
yiaddr: 223.1.2.4
transaction ID: 654
lifetime: 3600 secs

DHCP request (client → server)
Broadcast: OK. I’ll take that IP address!
src: 0.0.0.0, 68
dest: 255.255.255.255, 67
yiaddr: 223.1.2.4
transaction ID: 655
lifetime: 3600 secs

DHCP ACK (server → client)
Broadcast: OK. You’ve got that IP address!
src: 223.1.2.5, 67
dest: 255.255.255.255, 68
yiaddr: 223.1.2.4
transaction ID: 655
lifetime: 3600 secs

yiaddr = “your Internet address”


DHCP: more than IP addresses
DHCP can return more than just allocated IP address on subnet:
• address of first-hop router for client
• name and IP address of DNS server
• network mask (indicating network versus host portion of address)


DHCP: example
§ connecting laptop needs its IP address, addr of first-hop router, addr of DNS server: use DHCP
§ DHCP request encapsulated in UDP, encapsulated in IP, encapsulated in 802.3 Ethernet
§ Ethernet frame broadcast (dest: FFFFFFFFFFFF) on LAN, received at router running DHCP server
§ Ethernet demuxed to IP, IP demuxed to UDP, UDP demuxed to DHCP

[Figure: DHCP message passed down the laptop's stack (DHCP, UDP, IP, Eth, Phy) and up the same stack at the router with DHCP server built into router (168.1.1.1).]


DHCP: example
§ DHCP server formulates DHCP ACK containing client’s IP address, IP address of first-hop router for client, name & IP address of DNS server
§ encapsulation at DHCP server, frame forwarded to client, demuxing up to DHCP at client
§ client now knows its IP address, name and IP address of DNS server, IP address of its first-hop router

[Figure: DHCP ACK passed down the stack (DHCP, UDP, IP, Eth, Phy) at the router with DHCP server built into router, and up the same stack at the client.]


IP addresses: how to get one?
Q: how does network get subnet part of IP addr?
A: gets allocated portion of its provider ISP’s address space

ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20

Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
...              ...                                   ...
Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23
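The split of the ISP's /20 block into eight /23 blocks can be reproduced with the standard `ipaddress` module. This is a sketch of the arithmetic only; how an ISP actually assigns blocks to customers is a policy decision.

```python
import ipaddress

isp_block = ipaddress.ip_network("200.23.16.0/20")

# 2 ** (23 - 20) = 8 equal-sized /23 blocks, one per organization.
org_blocks = list(isp_block.subnets(new_prefix=23))

for number, block in enumerate(org_blocks):
    print(f"Organization {number}: {block}")
# Organization 0: 200.23.16.0/23
# Organization 1: 200.23.18.0/23
# ...
# Organization 7: 200.23.30.0/23
```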


Hierarchical addressing: route aggregation
hierarchical addressing allows efficient advertisement of routing information:

[Figure: Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP, which advertises to the Internet: “Send me anything with addresses beginning 200.23.16.0/20”. ISPs-R-Us advertises: “Send me anything with addresses beginning 199.31.0.0/16”.]


Hierarchical addressing: more specific routes

ISPs-R-Us has a more specific route to Organization 1

[Figure: Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP, which advertises: “Send me anything with addresses beginning 200.23.16.0/20”. Organization 1 (200.23.18.0/23) connects to ISPs-R-Us, which advertises: “Send me anything with addresses beginning 199.31.0.0/16 (255.255.0.0) or 200.23.18.0/23 (255.255.254.0)”. Longest prefix match determines the route.]
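How longest prefix match resolves the overlap between the /20 and the more specific /23 can be sketched with the `ipaddress` module. The three advertised prefixes come from the slide; the destination addresses and the name `next_hop_isp` are hypothetical.

```python
import ipaddress

# Prefixes advertised to the rest of the Internet, as on the slide.
ADVERTISED = [
    (ipaddress.ip_network("200.23.16.0/20"), "Fly-By-Night-ISP"),
    (ipaddress.ip_network("199.31.0.0/16"), "ISPs-R-Us"),
    (ipaddress.ip_network("200.23.18.0/23"), "ISPs-R-Us"),
]

def next_hop_isp(destination: str):
    """Choose the ISP whose advertised prefix is the longest match."""
    address = ipaddress.ip_address(destination)
    matches = [(net, isp) for net, isp in ADVERTISED if address in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]

print(next_hop_isp("200.23.18.5"))  # ISPs-R-Us (the /23 beats the /20)
print(next_hop_isp("200.23.20.9"))  # Fly-By-Night-ISP
```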

Network Address Translation


What is a public IP address?
• A public IP address is an address assigned to a computing device to allow direct access over the Internet.
• A web server, email server and any server device directly accessible from the Internet are candidates for a public IP address.
• A public IP address is globally unique and can only be assigned to a unique device.
• Internet Assigned Numbers Authority (IANA) is the organization responsible for registering IP address ranges to organizations and Internet Service Providers (ISPs).


What is a private IP address?
• A private IP address belongs to the address space allocated by InterNIC to allow organizations to create their own private network.
• There are three IP blocks reserved for private use (10.0.0.0/8 in the class A range, 172.16.0.0/12 in the class B range and 192.168.0.0/16 in the class C range).
• The computers, tablets and smartphones sitting behind your home router, and the personal computers within an organization, are usually assigned private IP addresses.
• A network printer residing in your home is assigned a private address so that only your family can print to your local printer.
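Whether an address falls in a reserved private block can be checked with the standard `ipaddress` module. A small sketch; the sample addresses are illustrative, with 138.76.29.7 borrowed from the NAT slides as a public example.

```python
import ipaddress

samples = ["10.0.0.1", "172.16.5.4", "192.168.1.20", "138.76.29.7"]

# is_private is True for addresses inside reserved private-use ranges.
private_flags = {
    address: ipaddress.ip_address(address).is_private for address in samples
}

for address, is_private in private_flags.items():
    print(address, "private" if is_private else "public")
```

Note that `is_private` also covers some other special-purpose ranges (for example loopback), so it is slightly broader than the three blocks named above.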


Public vs Private

[Figure. Source: https://hbeducationservices.blogspot.com/2018/11/public-and-private-ip-addresses.html]

NAT: network address translation

[Figure: a NAT router with Internet-side address 138.76.29.7 connects the rest of the Internet to a local network (e.g., home network) 10.0.0/24 with local addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4.]

• all datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
• datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual)

NAT: network address translation
motivation: local network uses just one IP address as far as outside world is concerned:
§ range of addresses not needed from ISP: just one IP address for all devices
§ can change addresses of devices in local network without notifying outside world
§ can change ISP without changing addresses of devices in local network
§ devices inside local net not explicitly addressable, visible by outside world (a security plus)


NAT: network address translation
implementation: NAT router must:
§ outgoing datagrams: replace (source IP address, port #) of every outgoing datagram with (NAT IP address, new port #)
. . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr
§ remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
§ incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table

NAT: network address translation

NAT translation table
WAN side addr          LAN side addr
138.76.29.7, 5001      10.0.0.1, 3345
……                     ……

1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
   S: 10.0.0.1, 3345      D: 128.119.40.186, 80
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
   S: 138.76.29.7, 5001   D: 128.119.40.186, 80
3: reply arrives, dest. address: 138.76.29.7, 5001
   S: 128.119.40.186, 80  D: 138.76.29.7, 5001
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
   S: 128.119.40.186, 80  D: 10.0.0.1, 3345

* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
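The four steps can be traced with a toy translation table. This is a sketch under simplifying assumptions: ports are handed out sequentially starting at 5001 to match the slide, entries never expire, and the class name `NatRouter` is illustrative.

```python
class NatRouter:
    """Toy NAT: maps (LAN address, port) <-> (WAN address, new port)."""

    def __init__(self, wan_ip: str, first_port: int = 5001) -> None:
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.lan_to_wan = {}
        self.wan_to_lan = {}

    def outgoing(self, src, dst):
        """Rewrite the source of a datagram leaving the local network."""
        if src not in self.lan_to_wan:
            wan = (self.wan_ip, self.next_port)
            self.next_port += 1
            self.lan_to_wan[src] = wan  # remember the translation pair
            self.wan_to_lan[wan] = src
        return self.lan_to_wan[src], dst

    def incoming(self, src, dst):
        """Rewrite the destination of a reply; None if no table entry."""
        lan = self.wan_to_lan.get(dst)
        return None if lan is None else (src, lan)

nat = NatRouter("138.76.29.7")
server = ("128.119.40.186", 80)

sent = nat.outgoing(("10.0.0.1", 3345), server)   # steps 1-2
reply = nat.incoming(server, ("138.76.29.7", 5001))  # steps 3-4
print(sent)   # (('138.76.29.7', 5001), ('128.119.40.186', 80))
print(reply)  # (('128.119.40.186', 80), ('10.0.0.1', 3345))
```

An incoming datagram with no matching table entry returns `None`, which is why hosts behind a NAT are not directly reachable from outside.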

NAT: network address translation
• 16-bit port-number field:
– over 60,000 simultaneous connections (2^16 = 65,536 port numbers) with a single LAN-side address!
• NAT is controversial:
– routers should only process up to layer 3
– address shortage should be solved by IPv6
– violates end-to-end argument
• NAT possibility must be taken into account by app designers, e.g., P2P applications
– NAT traversal: what if client wants to connect to server behind NAT?


IPv6


IPv6: motivation
• initial motivation: 32-bit address space soon to be completely allocated.
• additional motivation:
– header format helps speed processing/forwarding
– header changes to facilitate QoS

IPv6 datagram format:
– fixed-length 40 byte header
– no fragmentation allowed at intermediate routers

IPv6 datagram format
priority: identify priority among datagrams in flow
flow label: identify datagrams in same “flow” (concept of “flow” not well defined).
next header: identify upper layer protocol for data

Header layout (each row 32 bits wide):
ver | pri | flow label
payload len | next hdr | hop limit
source address (128 bits)
destination address (128 bits)
data

Other changes from IPv4
• checksum: removed entirely to reduce processing time at each hop
• options: allowed, but outside of header, indicated by “Next Header” field
• ICMPv6: new version of ICMP
– additional message types, e.g. “Packet Too Big”
– multicast group management functions


Transition from IPv4 to IPv6
• not all routers can be upgraded simultaneously
– no “flag days”
– how will network operate with mixed IPv4 and IPv6 routers?
• tunneling: IPv6 datagram carried as payload in IPv4 datagram among IPv4 routers

[Figure: an IPv4 datagram (IPv4 header fields, IPv4 source, dest addr) whose IPv4 payload is an entire IPv6 datagram (IPv6 header fields, IPv6 source, dest addr, UDP/TCP payload).]


Tunneling
logical view: A (IPv6), B (IPv6), IPv4 tunnel connecting IPv6 routers, E (IPv6), F (IPv6)
physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6)


Tunneling
logical view: A (IPv6), B (IPv6), IPv4 tunnel connecting IPv6 routers, E (IPv6), F (IPv6)
physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6)

• A-to-B: IPv6 datagram (flow: X, src: A, dest: F, data)
• B-to-C: IPv6 inside IPv4 (outer IPv4 header src: B, dest: E; inner IPv6 datagram flow: X, src: A, dest: F, data)
• D-to-E: IPv6 inside IPv4 (outer IPv4 header src: B, dest: E; inner IPv6 datagram flow: X, src: A, dest: F, data)
• E-to-F: IPv6 datagram (flow: X, src: A, dest: F, data)
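Tunneling is plain encapsulation: the whole IPv6 datagram becomes the payload of an IPv4 datagram addressed from tunnel entry B to tunnel exit E. The sketch below models only the fields shown on the slide; the class and function names are assumptions, and router names stand in for real addresses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IPv6Datagram:
    flow: str
    src: str
    dest: str
    data: bytes

@dataclass(frozen=True)
class IPv4Datagram:
    src: str
    dest: str
    payload: IPv6Datagram  # the entire IPv6 datagram, unchanged

def tunnel_entry(datagram: IPv6Datagram, entry: str, exit_: str) -> IPv4Datagram:
    """At B: wrap the IPv6 datagram in an IPv4 datagram for the tunnel."""
    return IPv4Datagram(src=entry, dest=exit_, payload=datagram)

def tunnel_exit(packet: IPv4Datagram) -> IPv6Datagram:
    """At E: strip the IPv4 header and continue forwarding as IPv6."""
    return packet.payload

original = IPv6Datagram(flow="X", src="A", dest="F", data=b"data")
in_tunnel = tunnel_entry(original, entry="B", exit_="E")
recovered = tunnel_exit(in_tunnel)

print(in_tunnel.src, in_tunnel.dest)  # B E
print(recovered == original)          # True
```

The IPv4 routers C and D only ever look at the outer header, so the inner source A and destination F are untouched.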
