
Computer Networks (CS F303) : Pilani

The document outlines the Computer Networks (CS F303) course at BITS Pilani, detailing the course objectives, structure, and administration. It covers fundamental concepts of networking, including network architecture, protocols, and the internet's structure, as well as practical applications and evaluation methods. The course aims to provide hands-on experience with network applications and an understanding of various networking principles and technologies.

Uploaded by

vibhorbarguje2nd

Computer Networks (CS F303)

Virendra Singh Shekhawat
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Today’s Agenda

• Course Overview
• Course Administration
• What is a network?
• What is the Internet?
• Network Structure
– Edge, Access Network (Physical Media), Network Core
• Internet Structure

Computer Networks CS F303 BITS Pilani, Pilani Campus
Course Objective

• To get familiar with the principles and workings of state-of-the-art networking
– Routing, forwarding, data transport, addressing, naming, congestion control, reliability, security, etc.
– Design of networks and services
• To learn how communication networks are put together
– Mechanisms, algorithms, technology components
• To understand network internals in a hands-on way
– Writing simple network applications, understanding and analyzing the working principles of various network protocols
Course Overview

• Internet Architecture and Computer Network Primitives


• Network Applications (Application Layer)
• End to End Data Transfer (Transport Layer)
• Data Routing and Forwarding (Network Layer)
• Access Networks & LANs (Link Layer)
• Communication Channels (Physical Layer)
• Wireless and Mobile Networks

Course Administration and Text Book(s)
• Instruction delivery
– Lecture classes
• 11:00 – 11:50 AM [T Th F]
– Lab classes
• Starting from 20th Jan 2025
• Course page Information
– NALANDA LMS
• Evaluation Plan
– Mid Semester Test @25%
– Quiz @15% [22 Feb 2025]
– Lab Exam @20% [20 Apr 2025]
– Comprehensive exam @40%
What is a Network?

• A shared infrastructure that allows distributed users to communicate with each other
– People, devices, …
– By means of voice, video, text, …
– For example, the telephone network, cable TV network, satellite network, military networks, etc.

• Basic building blocks are
– Nodes (hosts and forwarding devices) and links


The Internet: a “nuts and bolts” view
• Billions of connected computing devices:
– hosts = end systems
– running network apps at the Internet’s “edge”
• Packet switches: forward packets (chunks of data)
– routers, switches
• Communication links
– fiber, copper, radio, satellite
– transmission rate: bandwidth
• Networks
– collection of devices, routers, links: managed by an organization
[Figure: mobile network, home network, enterprise network, local or regional ISP, national or global ISP, content provider network, datacenter network]
The Internet: a “nuts and bolts” view
• Internet: “network of networks”
• Protocols are everywhere
– Control sending, receiving of messages
– e.g., HTTP (Web), streaming video, Skype, TCP, IP, WiFi, Ethernet
• Protocols define the format and order of messages sent and received among network entities, and the actions taken on message transmission and receipt
• Internet standards
– RFC: Request for Comments
– IETF: Internet Engineering Task Force
[Figure: the same networks (mobile 4G network, home network, enterprise network, local or regional ISP, national or global ISP, content provider network, datacenter network), annotated with the protocols in use: HTTP, streaming video, Skype, TCP, IP, WiFi, Ethernet]
The Internet: “nuts and bolts” view
• Network edge: applications and hosts
• Network core: interconnected routers
• Access networks: the network that physically connects a host to the rest of the Internet
• Physical media: wired, wireless communication links
[Figure: mobile network, home network, enterprise network, local or regional ISP, national or global ISP, content provider network, datacenter network]


Two key network-core functions

Forwarding:
• aka “switching”
• local action: move arriving packets from router’s input link to appropriate router output link

Routing:
• global action: determine source-destination paths taken by packets
• routing algorithms

[Figure: the routing algorithm fills in the local forwarding table; the router looks up the destination address in the arriving packet’s header to choose output link 1, 2, or 3]

Local forwarding table:
header value | output link
0100 | 3
0101 | 2
0111 | 2
1001 | 1
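The forwarding step can be sketched as a table lookup. This is only an illustration: the dictionary reuses the four entries from the slide's example table, and the names FORWARDING_TABLE and forward are hypothetical. A real router matches on address prefixes and the table is filled in by a routing algorithm, neither of which is modelled here.

```python
from typing import Optional

# Example table from the slide: header value -> output link.
# In a real router these entries are installed by the routing algorithm.
FORWARDING_TABLE = {
    "0100": 3,
    "0101": 2,
    "0111": 2,
    "1001": 1,
}


def forward(header_value: str) -> Optional[int]:
    """Local action: choose the output link for one arriving packet.

    Returns None when the table has no entry for the header value.
    """
    return FORWARDING_TABLE.get(header_value)


print(forward("0111"))  # 2
```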
Access Networks Example

Access networks: Data Center Networks
• High-bandwidth links (10s to 100s of Gbps) connect hundreds to thousands of servers together, and to the Internet
[Figure: datacenter network attached to the Internet. Courtesy: Massachusetts Green High Performance Computing Center (mghpcc.org)]


Physical Media-Guided
• Twisted pair
– Two insulated copper wires
– Transmission rates supported are 100 Mbps, 1 Gbps, 10
Gbps
• Coaxial Cable
– Two concentric copper conductors
– Multiple channels on cable
• Fiber Optic Cable
– Glass fiber carrying light pulses, each pulse a bit
– High speed operation (10 Gbps to 100 Gbps)
– Low error rate

Physical Media-Unguided
• Radio link types:
– Classified into three groups
• Short distance (a few meters), local areas (10 to a few hundred meters),
wide area (spans in tens of kms)
• LAN (e.g., WiFi)
– 11 Mbps, 54 Mbps, 10 Gbps (2020)
• Wide-area (e.g., cellular)
– 3G cellular: ~ few Mbps
– 4G cellular: ~100 Mbps
– 5G cellular: ~20 Gbps
• Satellite radio channels
– Provides bandwidth from Kbps up to a 45 Mbps channel (or multiple smaller channels)
Internet structure: network of networks

Question: given millions of access ISPs, how to connect them together?

[Figure: many unconnected access networks]


Internet structure: network of networks
Option: connect each access ISP to every other access ISP?

Connecting each access ISP to each other directly doesn’t scale: O(N^2) connections.

[Figure: access networks joined pairwise in a full mesh]
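To make the O(N^2) claim concrete, a full mesh over N access ISPs needs one link per unordered pair, that is N(N-1)/2 links. The short sketch below simply evaluates that count; the function name is hypothetical and the sample values of N are arbitrary.

```python
def full_mesh_links(n: int) -> int:
    """Number of direct links needed to connect each of n ISPs to every other one."""
    return n * (n - 1) // 2


for n in (10, 1_000, 1_000_000):
    print(n, full_mesh_links(n))
```

With a million access ISPs the count is already close to 5 * 10^11 links, which is why the following slides look at transit ISPs instead.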
Internet structure: network of networks
Option: connect each access ISP to a global transit ISP?

[Figure: every access network connected to a single global ISP]
Internet structure: network of networks

A single global ISP does not scale; there are multiple global ISPs.

[Figure: access networks connected to one of several global ISPs (ISP A, ISP B, ISP C)]
Internet structure: network of networks

Multiple global ISPs must be interconnected.

• Internet exchange points (IXPs) and peering links connect the global ISPs
• Customer and provider ISPs have an economic agreement

[Figure: ISP A, ISP B, and ISP C interconnected through IXPs and a peering link, with access networks attached]
Internet structure: network of networks

… and regional networks may arise to connect access nets to ISPs

[Figure: a regional net connecting several access networks to ISP A, ISP B, and ISP C, which are interconnected through IXPs]
Internet structure: network of networks
… and content provider networks (e.g., Google, Microsoft, Akamai) may run their own network, to bring services and content close to end users

[Figure: a content provider network connected to the ISPs, IXPs, and regional net alongside the access networks]
Internet structure: network of networks

[Figure: layered view of the Internet structure, with Google and IXP labelled]
Tier-1 ISP: e.g., Sprint
• POP: point-of-presence
[Figure: a POP with links to/from the backbone, peering links, and links to/from customers]


Layering of Airline Functionality

layer | departure airport | intermediate air-traffic control centers | arrival airport
ticket | ticket (purchase) | | ticket (complain)
baggage | baggage (check) | | baggage (claim)
gate | gates (load) | | gates (unload)
takeoff/landing | runway (takeoff) | | runway (land)
airplane routing | airplane routing | airplane routing, airplane routing | airplane routing

Layers: each layer implements a service
– Via its own internal-layer actions
– Relying on services provided by the layer below
Layered (Modular) Network Model (OSI)

Each layer performs specific operations.


The implementation of a layer can change as long as its interfaces stay intact.
Layer Encapsulation

Source host:
• application: message, M
• transport: segment, Ht M
• network: datagram, Hn Ht M
• link: frame, Hl Hn Ht M
• physical

Switch (link, physical): handles the frame Hl Hn Ht M
Router (network, link, physical): handles the datagram Hn Ht M carried inside the frame Hl Hn Ht M

Destination host:
• physical
• link: Hl Hn Ht M
• network: Hn Ht M
• transport: Ht M
• application: M
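The encapsulation picture can be sketched in a few lines of code. This is a toy model under stated assumptions: the headers are the literal placeholder strings Ht, Hn, and Hl from the slide rather than real TCP, IP, or Ethernet headers, and the function names are hypothetical.

```python
# Placeholder headers named after the slide: transport, network, link.
HEADERS = (b"Ht|", b"Hn|", b"Hl|")


def encapsulate(message: bytes) -> bytes:
    """Source host: each layer prepends its header on the way down the stack."""
    data = message
    for header in HEADERS:          # transport first, link last
        data = header + data
    return data                     # the frame: Hl Hn Ht M


def decapsulate(frame: bytes) -> bytes:
    """Destination host: each layer strips its header on the way up the stack."""
    data = frame
    for header in reversed(HEADERS):  # link first, transport last
        if not data.startswith(header):
            raise ValueError("unexpected header")
        data = data[len(header):]
    return data


frame = encapsulate(b"M")
print(frame)               # b'Hl|Hn|Ht|M'
print(decapsulate(frame))  # b'M'
```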
Internet Architecture Design Goals

0. Connect existing networks


Initially ARPANET and ARPA packet radio networks
1. Survivability
2. Support multiple types of services
Differ in Speed, Latency and Reliability
3. Must accommodate a variety of networks
4. ….
5. …

Advanced Computer Networks CS G525 BITS Pilani, Pilani Campus
Internet Hourglass Architecture

• Need to interconnect many existing networks
• Hide underlying technology from applications
• Decisions:
– Network provides minimal functionality
– “Narrow waist”
– Best Effort Service…!
– Tradeoff: no assumptions, no guarantees

[Figure: hourglass, from top (applications) to bottom (technology)]
• email, WWW, phone...
• SMTP, HTTP, RTP...
• TCP, UDP…
• IP
• ethernet, PPP…
• CSMA, async, sonet...
• copper, fiber, radio...
Internet History

Source: https://ventcube.com/history-of-the-internet-timeline/

• What will the Internet look like 10 years from now?
• Interested in the journey of the Internet?
– A Brief History of the Internet [Leiner et al., 2003]
• Internet design philosophy
– The Design Philosophy of The DARPA Internet
Protocols [Clark 1988]

The Network Core

• Mesh of interconnected routers
• How is data transferred through the network?
– Circuit switching: dedicated circuit per call (example: telephone network)
– Packet switching: data sent through the network in discrete “chunks” (example: Internet)


Circuit Switching: FDM and TDM
Example: 4 users
• FDM: each user gets its own frequency band for the whole time
• TDM: each user gets the whole frequency band during its recurring time slot
[Figure: frequency vs. time plots for FDM and TDM]
Circuit Switch: Numerical example

• How long does it take to send a file of 640,000 bits from host A to host B
over a circuit-switched network?
– All links are 1.536 Mbps
– Each link uses TDM with 24 slots/sec
– 500 msec to establish end-to-end circuit

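One way to work the numerical example is sketched below. It assumes the circuit is given exactly one of the 24 TDM slots on every link, so the circuit rate is the link rate divided by 24, and it ignores propagation delay. The function name is hypothetical.

```python
def circuit_transfer_time(file_bits: float, link_rate_bps: float,
                          tdm_slots: int, setup_seconds: float) -> float:
    """Time to send a file over a circuit that owns one TDM slot per link."""
    circuit_rate_bps = link_rate_bps / tdm_slots   # 1.536 Mbps / 24 = 64 kbps
    return setup_seconds + file_bits / circuit_rate_bps


print(circuit_transfer_time(640_000, 1_536_000, 24, 0.5))  # 10.5
```

Under these assumptions the answer is 10 seconds of transmission plus 0.5 seconds of circuit setup, and it does not depend on how many links the circuit crosses.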
Packet switching vs. Circuit switching

Packet switching allows more users to use the network!

Example:
• 1 Mb/s link
• Each user:
– 100 kb/s when “active”
– active 10% of time

• Circuit switching:
– How many users are supported?
• Packet switching:
– With 35 users, the probability that more than 10 are active at the same time is about .0004

Exercise: How did we get the value 0.0004?

[Figure: N users sharing a 1 Mbps link]
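A possible way to approach the exercise: if each of the 35 users is active independently with probability 0.1, the number of active users is binomial, and the quantity of interest is the tail probability of 11 or more active users. The sketch below assumes that independence; the function name is hypothetical.

```python
from math import comb


def prob_more_than(k: int, n: int, p: float) -> float:
    """P(X > k) for X ~ Binomial(n, p): more than k of n users active at once."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1, n + 1))


# Circuit switching: each user reserves 100 kb/s of the 1 Mb/s link.
circuit_users = 1_000_000 // 100_000
print(circuit_users)                          # 10

# Packet switching with 35 users, each active 10% of the time.
print(f"{prob_more_than(10, 35, 0.1):.4f}")   # 0.0004
```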
Packet-switching: store-and-forward

• Store and forward: entire packet must arrive at router before it can be transmitted on next link
• Packet transmission delay: takes L/R seconds to transmit (push out) L-bit packet into link at R bps

One-hop numerical example:
• L = 10 Kbits
• R = 100 Mbps
• One-hop transmission delay = 0.1 msec

[Figure: source sends packets 1, 2, 3 (L bits per packet) through a router to the destination over two links of R bps each]
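The one-hop figure follows directly from L/R. The sketch below also extends it to a path of several store-and-forward hops, assuming every link has the same rate and ignoring propagation, processing, and queueing delay; the function names are hypothetical.

```python
def transmission_delay(packet_bits: float, rate_bps: float) -> float:
    """Seconds needed to push an L-bit packet onto a link of R bps."""
    return packet_bits / rate_bps


def store_and_forward_delay(packet_bits: float, rate_bps: float, links: int) -> float:
    """One packet crossing `links` equal-rate links, fully received before each resend."""
    return links * transmission_delay(packet_bits, rate_bps)


one_hop = transmission_delay(10_000, 100_000_000)
print(f"{one_hop * 1000:.1f} msec")                                           # 0.1 msec
print(f"{store_and_forward_delay(10_000, 100_000_000, 2) * 1000:.1f} msec")   # 0.2 msec
```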


End-to-End Delay

[Figure: packets moving between nodes A, B, C, and D, showing the four delay components at a node: nodal processing, queueing, transmission, and propagation]
Transmission Delay vs. Propagation Delay: Caravan Analogy
[Figure: a ten-car caravan, a toll booth, 100 km of road, a second toll booth, and another 100 km of road]

• Cars “propagate” at 100 km/hr
• Toll booth takes 12 sec to service a car (car transmission time)
• Car is analogous to bit; caravan is analogous to packet
Questions???
1. How long does it take for the entire caravan to receive service at the tollbooth (that is, the time from when the first car enters service until the last car leaves the tollbooth)?
2. Once the first car leaves the tollbooth, how long does it take until it arrives at the next tollbooth?
3. Once the first car leaves the tollbooth, how long does it take until it enters service at the next tollbooth?
4. Are there ever two cars in service at the same time, one at the first tollbooth and one at the second tollbooth? Yes or No.
5. Are there ever zero cars in service at the same time, i.e., the caravan of cars has finished at the first tollbooth but not yet arrived at the second tollbooth? Yes or No.
6. How much time does it take to line up the whole caravan before the next tollbooth?
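The caravan questions can be explored with a small calculation. This is a sketch, not an answer key: times are measured in seconds from the moment the first car enters service at the first tollbooth, cars are assumed to drive off as soon as they are served, and the function name is hypothetical. Question 3 depends on whether the second tollbooth waits for the whole caravan, so it is left to the reader.

```python
def caravan(cars: int, service_s: float, distance_km: float, speed_kmh: float):
    """Timings (seconds) for one caravan between two tollbooths."""
    service_total_s = cars * service_s            # Q1: whole caravan served
    travel_s = distance_km * 3600 / speed_kmh     # Q2: one car driving booth to booth
    first_arrival_s = service_s + travel_s        # first car reaches the second booth
    lined_up_s = service_total_s + travel_s       # Q6: last car reaches the second booth
    overlap = first_arrival_s < service_total_s   # Q4: cars at both booths at once?
    return service_total_s, travel_s, first_arrival_s, lined_up_s, overlap


# 100 km/hr and 12 sec per car, as on the slide above.
print(caravan(10, 12, 100, 100))    # (120, 3600.0, 3612.0, 3720.0, False)
# 1000 km/hr and 1 min per car, as in the second version of the analogy.
print(caravan(10, 60, 100, 1000))   # (600, 360.0, 420.0, 960.0, True)
```

In the first setting the caravan is lined up at the second tollbooth after 3720 seconds (62 minutes); in the second, the first car arrives while cars are still being served at the first tollbooth.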
Caravan analogy [..2]
[Figure: a ten-car caravan, a toll booth, 100 km of road, a second toll booth, and another 100 km of road]

• Cars now “propagate” at 1000 km/hr
• Toll booth now takes 1 min to service a car

1. Are there ever two cars in service at the same time, one at the first tollbooth and one at the second tollbooth? Yes or No.
2. Are there ever zero cars in service at the same time, i.e., the caravan of cars has finished at the first tollbooth but not yet arrived at the second tollbooth? Yes or No.
Queuing Delay
• R = link bandwidth (bps)
• L = packet length (bits)
• a = average packet arrival rate (packets/sec)

traffic intensity = La/R

• La/R ~ 0: average queueing delay small
• La/R → 1: average queueing delay becomes large
• La/R > 1: more “work” arriving than can be serviced, average delay infinite!
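Traffic intensity is a one-line computation. As an illustration, the sketch below plugs in figures that appear later in these slides (100K-bit objects, 15 requests per second, a 1.54 Mbps access link); the function name is hypothetical.

```python
def traffic_intensity(packet_bits: float, arrivals_per_sec: float, rate_bps: float) -> float:
    """La/R: average bits arriving per second divided by bits the link can send per second."""
    return packet_bits * arrivals_per_sec / rate_bps


intensity = traffic_intensity(100_000, 15, 1_540_000)
print(f"{intensity:.3f}")   # 0.974
```

A value this close to 1 is the regime where average queueing delay becomes large.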
“Real” Internet delays and routes

• What do “real” Internet delay & loss look like?
• Traceroute program: provides delay measurement from source to router along end-to-end Internet path towards destination. For all i:
– Sends three packets (probes) that will reach router i on path towards destination
– Router i will return packets to sender
– Sender times interval between transmission and reply
– Read RFC 1393 for more detail
• http://traceroute.org
Real Internet delays and routes

traceroute: gaia.cs.umass.edu to www.eurecom.fr

1 cs-gw (128.119.240.254) 1 ms 1 ms 2 ms
2 border1-rt-fa5-1-0.gw.umass.edu (128.119.3.145) 1 ms 1 ms 2 ms
3 cht-vbns.gw.umass.edu (128.119.3.130) 6 ms 5 ms 5 ms
4 jn1-at1-0-0-19.wor.vbns.net (204.147.132.129) 16 ms 11 ms 13 ms
5 jn1-so7-0-0-0.wae.vbns.net (204.147.136.136) 21 ms 18 ms 18 ms
6 abilene-vbns.abilene.ucaid.edu (198.32.11.9) 22 ms 18 ms 22 ms
7 nycm-wash.abilene.ucaid.edu (198.32.8.46) 22 ms 22 ms 22 ms
8 62.40.103.253 (62.40.103.253) 104 ms 109 ms 106 ms
9 de2-1.de1.de.geant.net (62.40.96.129) 109 ms 102 ms 104 ms
10 de.fr1.fr.geant.net (62.40.96.50) 113 ms 121 ms 114 ms
11 renater-gw.fr1.fr.geant.net (62.40.103.54) 112 ms 114 ms 112 ms
12 nio-n2.cssi.renater.fr (193.51.206.13) 111 ms 114 ms 116 ms
13 nice.cssi.renater.fr (195.220.98.102) 123 ms 125 ms 124 ms
14 r3t2-nice.cssi.renater.fr (195.220.98.110) 126 ms 126 ms 124 ms
15 eurecom-valbonne.r3t2.ft.net (193.48.50.54) 135 ms 128 ms 133 ms
16 194.214.211.25 (194.214.211.25) 126 ms 128 ms 126 ms
17 * * *
18 * * *
19 fantasia.eurecom.fr (193.55.113.142) 132 ms 128 ms 136 ms

Notes:
• Each line shows 3 delay measurements, e.g., line 1 from gaia.cs.umass.edu to cs-gw.cs.umass.edu and line 2 to border1-rt-fa5-1-0.gw.umass.edu
• Hops 7 to 8: trans-oceanic link
• Hops 8 to 9: looks like delays decrease! Why?
• * means no response (probe lost, router not replying)


Packet loss
• Queue (aka buffer) preceding link has finite capacity
• Packet arriving to full queue is dropped (aka lost)
• Lost packet may be retransmitted by previous node, by source end system, or not at all
[Figure: hosts A and B send into a router whose buffer (waiting area) is full while a packet is being transmitted; a packet arriving to the full buffer is lost]


Throughput
• Throughput: rate (bits/time unit) at which bits are being sent from sender to receiver
– Instantaneous: rate at a given point in time
– Average: rate over a longer period of time

[Figure: a server with a file of F bits to send to the client sends bits (fluid) into a pipe; the first link has capacity Rs bits/sec (a pipe that can carry fluid at rate Rs bits/sec) and the second link has capacity Rc bits/sec (a pipe that can carry fluid at rate Rc bits/sec)]
Throughput: network scenario

[Figure: 10 servers with access links of Rs bits/sec and 10 clients with access links of Rc bits/sec; 10 connections (fairly) share the backbone bottleneck link of R bits/sec]
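A common way to reason about the two throughput pictures is that the slowest link on the path sets the rate. The sketch below assumes the backbone link is shared equally among the connections and that nothing else limits the flow; the function names and the sample rates are hypothetical.

```python
def end_to_end_throughput(rs_bps: float, rc_bps: float) -> float:
    """Server link Rs followed by client link Rc: the slower link is the bottleneck."""
    return min(rs_bps, rc_bps)


def shared_backbone_throughput(rs_bps: float, rc_bps: float,
                               backbone_bps: float, connections: int = 10) -> float:
    """Per-connection throughput when `connections` flows fairly share the backbone."""
    return min(rs_bps, rc_bps, backbone_bps / connections)


# Illustrative rates only: Rs = 2 Mbps, Rc = 1 Mbps, R = 5 Mbps.
print(end_to_end_throughput(2e6, 1e6))             # 1000000.0
print(shared_backbone_throughput(2e6, 1e6, 5e6))   # 500000.0
```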


Exercise

We have learned so far…

• Components of computer networks and Internet.


– Edge, network core, access networks, communication channels.
– Hosts, links, routers, switches, protocols, etc.
• Layered architecture of computer networks
– OSI model vs. TCP/IP model
• Network core design
– Packet switched networks vs. circuit switched networks
• Network performance measurement parameters
– Delay, throughput, bandwidth
Thank You!

Computer Networks (CS F303)
Virendra Singh Shekhawat
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Application Layer

• Network Application Architectures


– Client-Server, Peer-to-Peer, Hybrid (CS+P2P)
• Client Server Applications
– Web Browser/Web Server, File Transfer, Email, Telnet.
• Peer-to-Peer Applications
– BitTorrent, eDonkey, eMule, Gnutella, etc.
• Hybrid Applications
– Skype
• Cloud Based and CDN Based Applications
– YouTube, MS Teams, Zoom, Google Meet, etc.
• Internet Directory Structure
– DNS Protocol
Computer Networks (CS F303) BITS Pilani, Pilani Campus
What is a Network Application?

• Programs that run on different end systems and communicate over a network
– e.g., Web: Web server app communicates with browser app

• Network core devices do not run user application code

• Keeping applications on end systems allows for rapid application development

[Figure: protocol stacks (application, transport, network, data link, physical) on the end systems]


Application architectures

• Client-server
• Peer-to-Peer (P2P)
• Hybrid of client-server and P2P

Client-Server Architecture

Server:
– “always-on” host
– Permanent IP address
– For scaling, data center is used to create
large powerful virtual server

Clients:
– Communicate with server
– May be intermittently connected
– May have dynamic IP addresses
– Clients do not communicate directly
with each other
Pure P2P Architecture

• No “always-on” server
• Arbitrary end systems directly communicate
• Peers are intermittently connected and change IP addresses
– Example: Freenet and BitTorrent (file sharing apps)

Highly scalable but difficult to manage!
Hybrid of CS and P2P

Skype
– Internet telephony application
– Finding address of remote party: centralized server(s)
– Client-client connection is direct (not through server)

Instant messaging
– Chatting between two users is P2P
– Presence detection/location centralized:
• User registers its IP address with central server when it comes online
• User contacts central server to find IP addresses of buddies

How do Network Applications Communicate?

• A process sends/receives messages to/from its socket
– The socket is the interface between the application layer and the transport layer within the host
• Within the same host, two processes communicate using inter-process communication
• Processes in different hosts communicate by exchanging messages

[Figure: on each host or server, the process and its socket are controlled by the app developer; TCP with buffers and variables is controlled by the OS; the two hosts communicate across the Internet]
How to identify a process running on a
machine?
• To receive messages, a process must have an identifier
• Is the IP address of the host sufficient for identifying the process?
• Process identifier = IP address + port number
– e.g., HTTP server: 80, Mail server (SMTP): 25
– List of well-known port numbers is available at http://www.iana.org

[Figure: two hosts or servers, one running processes P1 and P2 and the other P3 and P4, each process with its own socket on top of TCP with buffers and variables, connected across the Internet]
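The socket interface and the (IP address, port number) identifier can be sketched with Python's standard socket module. This is a minimal loopback demonstration, not a production server: both endpoints run in one program, the server handles a single connection, the "upper-case echo" behaviour is invented for illustration, and the function names are hypothetical.

```python
import socket
import threading


def serve_once(server_sock: socket.socket) -> None:
    """Server process: accept one TCP connection and reply with the upper-cased data."""
    conn, _client_addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def demo(message: bytes = b"hello") -> bytes:
    """Run a server and a client on the loopback interface and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
        server_sock.listen(1)
        host, port = server_sock.getsockname()   # the server's identifier: IP + port
        worker = threading.Thread(target=serve_once, args=(server_sock,))
        worker.start()
        # Client process: create a socket and connect to the server's (IP, port).
        with socket.create_connection((host, port)) as client_sock:
            client_sock.sendall(message)
            reply = client_sock.recv(1024)
        worker.join()
    return reply


print(demo())   # b'HELLO'
```

The application code only touches the socket; buffering, retransmission, and the rest of TCP stay inside the operating system, which matches the split shown in the figure.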
What transport service does an app need?

• Data loss
– Some apps (e.g., audio, video) can tolerate some loss
– Other apps (e.g., file transfer, telnet) require 100% reliable data transfer
• Bandwidth
– Some apps (e.g., multimedia) require minimum amount of bandwidth to be
“effective”
– Other apps (“elastic apps”) make use of whatever bandwidth they get, e.g., e-mail, file transfer
• Timing
– Some apps (e.g., Internet telephony, interactive games) require low delay to be “effective”
Web and HTTP [1994]
Web page consists of objects
• Object can be HTML file, JPEG image, Java applet,
audio file,…
• Web page consists of base HTML-file which
includes several referenced objects
• Each object is addressable by a URL
• Example URLs:
https://www.bits-pilani.ac.in/pilani/computerscience/ProgrammesOffered
https://www.bits-pilani.ac.in/pilani/computerscience/Faculty
HTTP Protocol Overview [.1]

• Types of messages exchanged
– e.g., request, response
• Message syntax
– What fields are in messages and how fields are delineated
• Message semantics
– Meaning of information in fields
• Rules for when and how processes send and respond to messages

[Figure: a PC running the Firefox browser and an iPhone running the Safari browser exchange messages with a server running the Apache Web server]
HTTP Protocol Overview [..2]

Uses TCP:
• Client initiates TCP connection (creates socket) to server at port 80
• Server accepts TCP connection from client
• HTTP messages exchanged between browser (HTTP client) and Web server (HTTP server)
• TCP connection closed

[Figure: client/server timeline: one RTT to initiate the TCP connection, one RTT to request the file and start receiving it, plus the time to transmit the file, until the file is received]
HTTP Request Message

(\r = carriage return character, \n = line-feed character)

Request line (GET, POST, HEAD commands):
GET /index.html HTTP/1.1\r\n

Header lines:
Host: www-net.cs.umass.edu\r\n
User-Agent: Firefox/3.6.10\r\n
Accept: text/html,application/xhtml+xml\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7\r\n
Connection: keep-alive\r\n

Carriage return, line feed at start of line indicates end of header lines:
\r\n
data data data data data ...
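The message layout above (request line, header lines, blank line, body) can be split apart with plain string handling. The sketch below is a simplified illustration: it assumes a well-formed request, ignores details such as repeated headers, and uses a shortened copy of the slide's example; the function name is hypothetical.

```python
REQUEST = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www-net.cs.umass.edu\r\n"
    "User-Agent: Firefox/3.6.10\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)


def parse_request(raw: str):
    """Split an HTTP request into method, path, version, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")    # blank line ends the header lines
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ")  # the request line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body


method, path, version, headers, body = parse_request(REQUEST)
print(method, path, version)   # GET /index.html HTTP/1.1
print(headers["host"])         # www-net.cs.umass.edu
```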
Response Message
Status line (protocol, status code, status phrase):
HTTP/1.1 200 OK\r\n

Header lines:
Date: Sun, 26 Sep 2010 20:09:20 GMT\r\n
Server: Apache/2.0.52 (CentOS)\r\n
Last-Modified: Tue, 30 Oct 2007 17:00:02 GMT\r\n
ETag: "17dc6-a5c-bf716880"\r\n
Accept-Ranges: bytes\r\n
Content-Length: 2652\r\n
Keep-Alive: timeout=10, max=100\r\n
Connection: Keep-Alive\r\n
Content-Type: text/html; charset=ISO-8859-1\r\n
\r\n

Data, e.g., requested HTML file:
data data data data data ...
HTTP Method Types
HTTP/1.0
• GET
– Include user data in URL field of HTTP GET request message (following a ‘?’):
• https://www.bitsadmission.com/bitsatmain.aspx?id=11012016
• POST
– User input sent from client to server in entity body of HTTP POST request message
• HEAD
– Asks server to leave requested object out of response

HTTP/1.1
• Additional methods
• PUT
– Uploads file in entity body to path specified in URL field
• DELETE
– Deletes file specified in the URL field
HTTP Response status Codes
200 OK
– request succeeded, requested object later in this msg
301 Moved Permanently
– requested object moved, new location specified later in this msg (Location:)
400 Bad Request
– request msg not understood by server
404 Not Found
– requested document not found on this server
505 HTTP Version Not Supported
– the HTTP version used in the request is not supported by the server.

How is a web page transferred?

• Let’s assume a web page consists of a base HTML file and 5 JPEG images.

Non-persistent HTTP
• At most one object is sent over a TCP connection
• HTTP/1.0 uses non-persistent HTTP

Persistent HTTP
• Multiple objects can be sent over single TCP connection between client and server.
• Persistent with Pipeline vs. Persistent without Pipeline
• HTTP/1.1 uses persistent connections in default mode
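For the example page (a base HTML file plus 5 JPEG images), the three modes can be compared by counting round trips. This is a rough sketch under simplifying assumptions that are not stated on the slide: object transmission time is ignored, non-persistent connections are opened one after another rather than in parallel, each new TCP connection costs one RTT before the first request, and the function names are hypothetical.

```python
def non_persistent_rtts(objects: int) -> int:
    """Each object: 1 RTT to open a TCP connection + 1 RTT for request/response."""
    return 2 * objects


def persistent_rtts(objects: int, pipelined: bool) -> int:
    """One TCP connection: 1 RTT setup + 1 RTT for the base file, then the rest."""
    referenced = objects - 1
    if referenced == 0:
        return 2
    # Pipelining sends all remaining requests back to back: about 1 more RTT.
    return 2 + (1 if pipelined else referenced)


objects = 1 + 5   # base HTML file + 5 JPEG images
print(non_persistent_rtts(objects))                # 12
print(persistent_rtts(objects, pipelined=False))   # 7
print(persistent_rtts(objects, pipelined=True))    # 3
```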
State in HTTP using “Cookies”
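[Figure: client/server exchange between a browser and the Amazon server]
• The client’s cookie file initially holds: ebay 8734
• Client sends a usual HTTP request msg; the Amazon server creates ID 1678 for the user and creates an entry in its backend database
• Server sends a usual HTTP response with set-cookie: 1678; the cookie file now holds: ebay 8734, amazon 1678
• Client sends a usual HTTP request msg with cookie: 1678; the server accesses the backend database, takes a cookie-specific action, and sends a usual HTTP response msg
• One week later: the client again sends a usual HTTP request msg with cookie: 1678; the server again accesses the backend database, takes a cookie-specific action, and sends a usual HTTP response msg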
Example: displaying a NY Times web page

• Steps 1-2: GET base html file from nytimes.com (HTTP GET, HTTP reply)
• Steps 4-5: fetch ad from AdX.com (HTTP GET, HTTP reply)
• Step 7: display composed page

[Figure: browser, nytimes.com, and AdX.com with numbered steps 1 to 7; NY Times page with embedded ad displayed]
Cookies: Tracking a user’s browsing behavior

[Figure: browser, nytimes.com (sports), and AdX.com]
• HTTP GET to nytimes.com (sports); HTTP reply with Set cookie: 1634
– nytimes.com records: 1634: sports, 2/15/22
– “First party” cookie: from the website you chose to visit (provides base html file)
• Steps 4-5: HTTP GET to AdX.com with Referer: NY Times Sports; HTTP reply with Set cookie: 7493
– AdX.com records: 7493: NY Times sports, 2/15/22
– “Third party” cookie: from a website you did not choose to visit
• Browser cookie file: NY Times: 1634, AdX: 7493
Cookies: tracking a user’s browsing behavior

AdX:
• tracks my web browsing over sites with AdX ads
• can return targeted ads based on browsing history

[Figure: browser, nytimes.com, socks.com, and AdX.com]
• nytimes.com still holds: 1634: sports, 2/15/22
• Steps 1-2: HTTP GET to socks.com and its reply
• Steps 4-5: HTTP GET to AdX.com with Referer: socks.com, cookie: 7493; HTTP reply with Set cookie: 7493
– AdX.com records: 7493: NY Times sports, 2/15/22 and 7493: socks.com, 2/16/22
• Browser cookie file: NY Times: 1634, AdX: 7493
Cookies: tracking a user’s browsing behavior (one day later)

[Figure: browser, nytimes.com (arts), socks.com, and AdX.com]
• HTTP GET to nytimes.com (arts) with cookie: 1634; HTTP reply with Set cookie: 1634
– nytimes.com records: 1634: sports, 2/15/22 and 1634: arts, 2/17/22
• Steps 4-5: HTTP GET to AdX.com with Referer: nytimes.com, cookie: 7493; HTTP reply with Set cookie: 7493
– AdX.com records: 7493: NY Times sports, 2/15/22; 7493: socks.com, 2/16/22; 7493: NY Times arts, 2/17/22
– Returned ad for socks!
• Browser cookie file: NY Times: 1634, AdX: 7493
Web Caches (aka Proxy Server)
Goal: satisfy client requests without involving origin server

[Figure: clients send their requests to a Web cache, which sits between the clients and the origin server]
Conditional GET
• Goal: don’t send object if cache has up-to-date cached version
• Cache: specify date of cached copy in HTTP request
If-modified-since: <date>
• Server: response contains no object if cached copy is up-to-date:
HTTP/1.0 304 Not Modified

[Figure: two client/server exchanges]
• Object not modified since <date>: HTTP request msg with If-modified-since: <date>; HTTP response HTTP/1.0 304 Not Modified
• Object modified after <date>: HTTP request msg with If-modified-since: <date>; HTTP response HTTP/1.0 200 OK followed by <data>
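The server-side decision in a conditional GET can be sketched as a single comparison. This is an illustration only: dates are passed as already-parsed datetime values rather than header strings, the function name is hypothetical, and real servers also handle validators such as ETag that are not covered here.

```python
from datetime import datetime
from typing import Optional, Tuple


def conditional_get(last_modified: datetime,
                    if_modified_since: Optional[datetime],
                    obj: bytes) -> Tuple[int, bytes]:
    """Return (status code, body) for a GET with an optional If-modified-since date."""
    if if_modified_since is not None and last_modified <= if_modified_since:
        return 304, b""      # Not Modified: the cached copy is still up to date
    return 200, obj          # OK: send the (new) object


modified = datetime(2007, 10, 30, 17, 0, 2)
print(conditional_get(modified, datetime(2010, 9, 26), b"<html>"))   # (304, b'')
print(conditional_get(modified, datetime(2007, 1, 1), b"<html>"))    # (200, b'<html>')
```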
Caching example
Scenario:
• Access link rate: 1.54 Mbps
• RTT from institutional router to server: 2 sec
• Web object size: 100K bits
• Average request rate from browsers to origin servers: 15/sec
– Avg data rate to browsers: 1.50 Mbps

Performance:
• Access link utilization = .97
• LAN utilization: .0015
• End-end delay = Internet delay + access link delay + LAN delay = 2 sec + minutes + msecs

Problem: large queueing delays at high utilization!

[Figure: institutional network (1 Gbps LAN) connected over a 1.54 Mbps access link to the public Internet and the origin servers]
Option 1: Buy a faster access link

Scenario:
• Access link rate: 154 Mbps (upgraded from 1.54 Mbps)
• RTT from institutional router to server: 2 sec
• Web object size: 100K bits
• Average request rate from browsers to origin servers: 15/sec
• Avg data rate to browsers: 1.50 Mbps

[Figure: origin servers, public Internet, 154 Mbps access link, institutional network with 1 Gbps LAN]

Performance:
• Access link utilization = .0097 (down from .97)
• LAN utilization: .0015
• End-end delay = Internet delay + access link delay + LAN delay
  = 2 sec + msecs + msecs (access link delay drops from minutes to msecs)

Cost: faster access link (expensive!)
BITS Pilani, Pilani Campus
Option 2: Install a web cache

Scenario:
• Access link rate: 1.54 Mbps
• RTT from institutional router to server: 2 sec
• Web object size: 100K bits
• Average request rate from browsers to origin servers: 15/sec
• Avg data rate to browsers: 1.50 Mbps

[Figure: origin servers, public Internet, 1.54 Mbps access link, institutional network with 1 Gbps LAN and a local web cache]

Cost: web cache (cheap!)

Performance:
• LAN utilization = ?
• Access link utilization = ?
• Average end-end delay = ?
How to compute link utilization, delay?

27
BITS Pilani, Pilani Campus
Calculating Access Link Utilization and
End-to-end Delay with Cache

Suppose cache hit rate is 0.4

• 40% of requests are served by the cache, with low (msec) delay
• Rate to browsers over access link
  = 0.6 * 1.50 Mbps = 0.9 Mbps
• Access link utilization = 0.9/1.54 = 0.58, which means low (msec) queueing delay at the access link
• Average end-end delay
  = 0.6 * (delay from origin servers) + 0.4 * (delay when satisfied at cache)
  = 0.6 (2.01) + 0.4 (~msecs) = ~ 1.2 secs

Lower average end-end delay than with 154 Mbps link (and cheaper too!)
BITS Pilani, Pilani Campus
HTTP/1.1: HOL blocking

HTTP 1.1: client requests 1 large object (e.g., video file) and 3 smaller objects

[Figure: client sends GET O1, GET O2, GET O3, GET O4 to the server; the server returns the object data in the order requested, so O2, O3 and O4 wait behind the large object O1]

Server responds in-order (FCFS) to multiple GET requests
BITS Pilani, Pilani Campus
HTTP/2: Stream Multiplexing

• What is a stream?
– A bidirectional sequence of frames exchanged between the client and the server
within an HTTP/2 connection

31
Computer Networks (CS F303) BITS Pilani, Pilani Campus
HTTP/2: Mitigating HOL blocking

HTTP/2: Objects divided into frames, frame transmission interleaved

[Figure: client sends GET O1, GET O2, GET O3, GET O4 to the server; the server interleaves frames of O1, O2, O3 and O4 instead of sending each object whole]

O2, O3, O4 delivered quickly, O1 slightly delayed


BITS Pilani, Pilani Campus
HTTP/2

Key goal: decreased delay in multi-object HTTP requests

HTTP/2: [RFC 7540, 2015] increased flexibility at server in sending objects to client:
• Methods, status codes, most header fields unchanged from HTTP 1.1
• Divide objects into frames, schedule frames to mitigate HOL blocking
• Also push unrequested objects to client

33
BITS Pilani, Pilani Campus
Binary Framing Layer

• HTTP/2 allows transmission of parallel multiplexed requests and responses


– HTTP/2 breaks down the HTTP protocol communication into an exchange of binary-
encoded frames, which are then mapped to messages that belong to a particular
stream, all of which are multiplexed within a single TCP connection.

34
Computer Networks (CS F303) BITS Pilani, Pilani Campus
HTTP/2.0 Connection
• Stream: A bidirectional flow of bytes
within an established connection, which
may carry one or more messages.
• Message: A complete sequence of
frames that map to a logical request or
response message.
• Frame: The smallest unit of
communication in HTTP/2, each
containing a frame header, which at a
minimum identifies the stream to which
the frame belongs.

35
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Domain Name System (DNS)

• The domain name system maps the name people use to locate a website to
the IP address that a computer uses to locate a website.

• Why do we need the mapping between host name and IP address?

• Application-layer protocol: hosts, name servers communicate to resolve
names (address/name translation)

36
Computer Networks CS F303 BITS Pilani, Pilani Campus
What happens when a browser requests a
URL?
• User machine runs the client side of the DNS application
• The browser extracts the hostname from the URL and passes
the hostname to the client side of the DNS application
• The DNS client sends a query to a DNS server
• The DNS client receives a reply with IP address of the
hostname
• The browser initiates a TCP connection to the HTTP server process
at that IP address, port 80
37
Computer Networks (CS F303) BITS Pilani, Pilani Campus
DNS Structure – Distributed Hierarchical
Database
Root DNS Servers
    com DNS servers
        yahoo.com DNS servers
        amazon.com DNS servers
    org DNS servers
        pbs.org DNS servers
    edu DNS servers
        poly.edu DNS servers
        umass.edu DNS servers

Client wants IP for www.amazon.com; 1st approx:
• Client queries root server to find com DNS server
• Client queries .com DNS server to get amazon.com DNS server
• Client queries amazon.com DNS server to get IP address for
www.amazon.com
List of all top level domain servers is available at: https://www.icann.org/resources/pages/tlds-2012-02-25-en
Computer Networks CS F303 BITS Pilani, Pilani Campus
Root Name Servers

• Root name server:
– 13 servers in total, mostly located in North America.
– Each server is actually a network of replicated servers (~200 servers in US)
– ICANN manages root DNS domain

13 root name "servers" worldwide:
a. Verisign, Los Angeles CA (5 other sites)
b. USC-ISI Marina del Rey, CA
c. Cogent, Herndon, VA (5 other sites)
d. U Maryland College Park, MD
e. NASA Mt View, CA
f. Internet Software C. Palo Alto, CA (and 48 other sites)
g. US DoD Columbus, OH (5 other sites)
h. ARL Aberdeen, MD
i. Netnod, Stockholm (37 other sites)
j. Verisign, Dulles VA (69 other sites)
k. RIPE London (17 other sites)
l. ICANN Los Angeles, CA (41 other sites)
m. WIDE Tokyo (5 other sites)

39
Computer Networks CS F303 BITS Pilani, Pilani Campus
Top-Level Domain, and Authoritative Servers
Top-Level Domain (TLD) servers:
• Responsible for .com, .org, .net, .edu, .aero, .jobs, .museum, and all top-level
country domains, e.g.: .cn, .uk, .fr, .ca, .jp
• Network Solutions: authoritative registry for .com, .net TLD
• Educause: .edu TLD

Authoritative DNS servers:
• Organization's own DNS server(s), providing authoritative hostname to IP
mappings for organization's named hosts
• Can be maintained by organization or service provider
40
Computer Networks (CS F303) BITS Pilani, Pilani Campus
DNS Services

• Hostname to IP address translation


– Host name to IP address mapping
• Host aliasing
– Canonical name to alias name(s) mapping

• Mail server aliasing


– Canonical name to alias name mapping

• Load distribution
– Replicated Web servers: many IP addresses correspond to one name
41
Computer Networks CS F303 BITS Pilani, Pilani Campus
DNS Query Processing - Recursive

Recursive query:
• Puts burden of name resolution on contacted name server
• Heavy load at upper levels of hierarchy?

[Figure: requesting host cis.poly.edu resolves gaia.cs.umass.edu. Messages 1 to 8: the query travels from the host to the local DNS server dns.poly.edu, then to the root DNS server, the TLD DNS server, and the authoritative DNS server dns.cs.umass.edu; the reply returns along the same path in reverse]

42
Computer Networks CS F303 BITS Pilani, Pilani Campus
DNS Query Processing - Iterative

• TLD server may know only of an intermediate DNS server for the
hostname, which in turn knows the authoritative DNS server for the
hostname.
• DNS responses are usually cached to improve the delay performance and to
reduce the number of DNS messages
– e.g., Local DNS server caches the TLD server information

[Figure: requesting host cis.poly.edu resolves gaia.cs.umass.edu. Messages 1 to 8: the host asks the local DNS server dns.poly.edu (1), which in turn queries the root DNS server (2, 3), the TLD DNS server (4, 5), and the authoritative DNS server dns.cs.umass.edu (6, 7), then answers the host (8)]

43
Computer Networks CS F303 BITS Pilani, Pilani Campus
DNS Records
DNS: distributed database for storing resource records (RR)
RR format: (name, value, type, ttl)

type=A
• name is hostname
• value is IP address

type=NS
• name is domain (e.g., foo.com)
• value is hostname of authoritative name server for this domain

type=CNAME
• name is alias name for some "canonical" (the real) name
• www.ibm.com is really servereast.backup2.ibm.com
• value is canonical name

type=MX
• value is canonical name of mailserver associated with name, i.e., mail
server aliasing
44
Computer Networks CS F303 BITS Pilani, Pilani Campus
DNS Messages
• Query and reply messages, both with same message format
• Explore DNS protocol in Lab Session #2

Msg Header
• Identification: 16 bit # for query, reply to query uses same #
• Flags:
  – query or reply
  – recursion desired
  – recursion available
  – reply is authoritative

Message layout (each header field is 2 bytes):
  identification | flags
  # questions | # answer RRs
  # authority RRs | # additional RRs
  questions (variable # of questions): Name, Type fields for a query
  answers (variable # of RRs): RRs in response to query
  authority (variable # of RRs): Records for authoritative servers
  additional info (variable # of RRs)


45
Computer Networks CS F303 BITS Pilani, Pilani Campus
Inserting Records into DNS
• A newly created domain name should be first registered at a registrar
– Internet Corporation for Assigned Names and Numbers (ICANN) accredits the registrars
– Accredited registrar list is available at www.internic.net
– Registrar is a commercial entity that verifies the uniqueness of the domain name.

46
Computer Networks CS F303 BITS Pilani, Pilani Campus
FTP: File Transfer Protocol
[Figure: user at host with FTP user interface, FTP client and local file system; file transfer between the FTP client and the FTP server, which has the remote file system]

• Transfer file to/from remote host
• Client/server model
  – Client: side that initiates transfer (either to/from remote)
  – Server: remote host
• ftp: RFC 959
• ftp server: port 21

47
Computer Networks CS F303 BITS Pilani, Pilani Campus
FTP: Connections

• Control connection: TCP control connection, server port 21
– Authorization, directory listing etc.

• When server receives file transfer command,
– Server opens 2nd TCP data connection (for file) to client, server port 20

• After transferring one file, server closes data connection

48
Computer Networks CS F303 BITS Pilani, Pilani Campus
FTP Commands and Responses

Sample commands:
• Sent as ASCII text over control channel
• USER username
• PASS password
• LIST: return list of files in current directory
• RETR filename: retrieves (gets) file
• STOR filename: stores (puts) file onto remote host

Sample return codes:
• Status code and phrase (as in HTTP)
• 331 Username OK, password required
• 125 data connection already open; transfer starting
• 425 Can't open data connection
• 452 Error writing file

49
Computer Networks CS F303 BITS Pilani, Pilani Campus
eMail
Three major components:
• User agents
– e.g., Outlook, Thunderbird
• Mail servers
– Contains incoming messages for user (user mailbox) and an outgoing message queue
• Simple mail transfer protocol: SMTP

[Figure: several user agents, each attached to a mail server; the mail servers exchange messages with each other using SMTP]

50
Computer Networks CS F303 BITS Pilani, Pilani Campus
SMTP [RFC 5321, Original RFC 821]

• Uses TCP to reliably transfer email message from client to server, port 25
• Direct transfer: sender’s mail server to receiver’s mail server
• Three phases of transfer
– Handshaking (greeting), transfer of messages, connection closure
• Command/response interaction (like HTTP, FTP)
– Commands: ASCII text
– Response: status code and phrase
• Messages must be in 7-bit ASCII
– Painful for multimedia data

51
Computer Networks CS F303 BITS Pilani, Pilani Campus
Mail Transfer Process

S: 220 hamburger.edu
C: HELO crepes.fr
S: 250 Hello crepes.fr, pleased to meet you
C: MAIL FROM: <[email protected]>
S: 250 [email protected]... Sender ok
C: RCPT TO: <[email protected]>
S: 250 [email protected] ... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: Do you like ketchup?
C: How about pickles?
C: .
S: 250 Message accepted for delivery
C: QUIT
S: 221 hamburger.edu closing connection 52
Computer Networks CS F303 BITS Pilani, Pilani Campus
Mail Access Protocols

• Mail access protocol: retrieval from server


– POP3 [Port:110]: Post Office Protocol [RFC 1939]: authorization, download and keep,
download and delete
• User can create folders and move the messages into them locally.
• Stateless across sessions
– IMAP: Internet Message Access Protocol [RFC 1730]: more features, including manipulation of
stored msgs on server
• Allows the user to create remote folders and maintains user state information across IMAP sessions
• Permits a user agent to obtain components of messages. Good for low bandwidth connections.
Computer Networks CS F303 BITS Pilani, Pilani Campus
POP3 Protocol
Authorization phase
• Client commands:
– user: declare username
– pass: password
• Server responses
– +OK
– -ERR

Transaction phase, client:
• list: list message numbers
• retr: retrieve message by number
• dele: delete
• quit

Example session:
S: +OK POP3 server ready
C: user alex
S: +OK
C: pass hungry
S: +OK user successfully logged on
C: list
S: 1 498
S: 2 912
S: .
C: retr 1
S: <message 1 contents>
S: .
C: dele 1
C: retr 2
S: <message 2 contents>
S: .
C: dele 2
C: quit
S: +OK POP3 server signing off
Computer Networks CS F303 BITS Pilani, Pilani Campus
Web based E-Mail

• Hotmail introduced Web-based access in the 1990s

55
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Thank You!

56
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Computer Networks (CS F303)
Virendra Singh Shekhawat
BITS Pilani Department of Computer Science and Information Systems
Pilani Campus
Today’s Agenda

• Peer to Peer Applications and Protocols


– P2P File Distribution, Bit Torrent Protocol

• Database Implementation Protocol in P2P Networks


– Distributed Hash Tables (DHTs)
– Chord Protocol

2
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Peer to Peer (P2P) Architecture

• No always-on server
• Arbitrary end systems directly communicate
• Peers are intermittently connected
• Examples
– File distribution (BitTorrent)
– Streaming (KanKan)
– VoIP (Skype)

3
Computer Networks CS F303 BITS Pilani, Pilani Campus
File Distribution: P2P vs CS
How much time is required to distribute a file (size F) from one server to N peers?
– Peer upload/download capacity is a limited resource

• us: server upload capacity
• ui: peer i upload capacity
• di: peer i download capacity

[Figure: server holding a file of size F with uplink us, connected through a network (with abundant bandwidth) to peers 1..N with upload rates u1..uN and download rates d1..dN]
4
Computer Networks CS F303 BITS Pilani, Pilani Campus
File Distribution Time – Client Server
• Server transmission: must sequentially send (upload) N file copies
– Time to send N copies: NF/us

• Client: each client must download file copy
– dmin = min client download rate
– Slowest client download time: F/dmin

Time to distribute F to N clients using client-server approach:
Dc-s ≥ max{NF/us, F/dmin}

5
Computer Networks CS F303 BITS Pilani, Pilani Campus
File Distribution Time - Peer to Peer
• Server transmission: must upload at least one copy
– Time to send one copy: F/us

• Client: each client must download file copy
– Slowest client download time: F/dmin

• Clients: as aggregate must download NF bits
– Max upload rate (limiting max download rate) is us + Σui

Time to distribute F to N clients using P2P approach:
DP2P ≥ max{F/us, F/dmin, NF/(us + Σui)}

6
Computer Networks CS F303 BITS Pilani, Pilani Campus
Exercise

• Distributing a File F = 15 Gbits to 100 peers


• Server upload rate is us = 30 Mbps
• Each peer download rate is di = 2 Mbps
• Each peer upload rate is u = 0.5 Mbps
• Question
– Calculate minimum distribution time for F in both CS and P2P scenarios.

7
Computer Networks (CS F303) BITS Pilani, Pilani Campus
CS vs P2P: Example

client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us

[Plot: Minimum Distribution Time (axis 0 to 3.5) versus N (axis 0 to 35), with one curve for P2P and one for Client-Server]
8
Computer Networks CS F303 BITS Pilani, Pilani Campus
P2P File Distribution Example: BitTorrent

• File divided into chunks of 64 KB to 1 MB (typically 256 KB)

• Peers in torrent send/receive file chunks
– At any given time, each peer will have a subset of chunks from the file.
– A peer asks its neighbors for the list of chunks they have and gets a list from each.
– A peer must decide:
  • Which chunks should it request first from its neighbors?
  • To which of its neighbors should it send requested chunks?
– Tit-for-tat trading algorithm

9
Computer Networks CS F303 BITS Pilani, Pilani Campus
The lookup problem

[Figure: peers N1 to N6 connected through the Internet. A publisher stores Key = "data item", Value = video lecture at one of the peers; a client issues Lookup("data item") and must find that peer]

Decentralized network with several peers (servers/clients)
How to find the specific peer that hosts the desired data within this network?
10
Computer Networks CS F303 BITS Pilani, Pilani Campus
P2P Protocols

Napster

Gnutella
Kazaa (Skype is based on Kazaa)
11
Computer Networks CS F303 BITS Pilani, Pilani Campus
Distributed Hash Table (DHT)
• Key-value pairs are distributed across peers.

• Any Peer can query the distributed database with a particular key.

• The distributed DB locates the peers that hold the corresponding (key,
value) pairs and returns those pairs to the querying peer.

• Each peer holds only a small subset of the total key-value pairs.

• Any Peer can insert new (key, value) pairs into the DB.

12
Computer Networks CS F303 BITS Pilani, Pilani Campus
DHT Implementation [.1]

• Randomly scatter the (key, value) pairs across all the peers.

• Each peer maintains a list of the IP addresses of all peers.

• The querying peer sends its query to all other peers.

• The peers containing the (key, value) pairs that match the key can respond
with matching pairs.

13
Computer Networks CS F303 BITS Pilani, Pilani Campus
DHT Implementation [..2]: Circular DHT

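• Hash function assigns each "node" and "key" an m-bit identifier using a
base hash function such as SHA-1
– Node_ID = hash(IP, Port)
– Key_ID = hash(original key)

• ID space: 0 to 2^m - 1. Here m = 6, so the range is 64 identifiers (0 to 63).

• Assign each (key, value) pair to the peer that has the closest ID
(in Chord, the first node whose ID equals or follows the key ID on the circle).

[Figure: identifier circle with nodes N2, N10, N20, N40, N50, N60, N63 and keys k7, k11, k16, k25, k39, k46, k58]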


14
Computer Networks CS F303 BITS Pilani, Pilani Campus
Chord Protocol:Lookup Operation Example

Predecessor: pointer to the previous node on the id circle
Successor: pointer to the succeeding node on the id circle

• Ask node n to find the successor of id
• If id is between n and its successor, return successor
• Else forward the query to n's successor, and so on

Algorithm complexity???
Computer Networks CS F303 BITS Pilani, Pilani Campus
Scalable node localization
• Each node n contains a routing table with up to m entries
(m: number of bits of the identifier) => finger table
• The ith entry in the table at node n contains the first node s that
succeeds n by at least 2^(i-1) on the identifier circle
– s = successor((n + 2^(i-1)) mod 2^m)
– s is called the ith finger of node n

16
Computer Networks CS F303 BITS Pilani, Pilani Campus
Scalable node localization: Algorithm

• Search the finger table for the node that most immediately precedes the key
• Invoke find_successor from that node

// ask node n to find the successor of id
n.find_successor(id)
    if (id in (n, successor])
        return successor;
    else
        // search the local table for the highest predecessor of id
        n' = closest_preceding_node(id);
        return n'.find_successor(id);

n.closest_preceding_node(id)
    for i = m down to 1 do
        if (finger[i] in (n, id))
            return finger[i];
    return n;

Algorithm's Complexity???

Computer Networks CS F303 BITS Pilani, Pilani Campus


Failure Recovery (Peer Churn)

• The key step in failure recovery is maintaining correct successor pointers
• To achieve this, each node maintains a successor-list of its r nearest
successors on the ring
• If node n notices that its successor has failed, it replaces it with the first
live entry in the successor list
• The stabilize procedure then corrects finger table entries and successor-list
entries that point to the failed node
• How often the stabilization protocol is invoked should depend on how
frequently nodes leave and join
18
Computer Networks CS F303 BITS Pilani, Pilani Campus
Next… Transport Layer

• Creating network Applications


– Socket Programming
• TCP vs. UDP Sockets
• Transport Layer
– Transport Layer Services
• Multiplexing/Demultiplexing
– Connectionless and Connection Oriented
» TCP and UDP
• Reliable data transfer (Protocol design)
• Flow control
• Congestion control
19
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket Programming [.1]
• What is a socket?
– To the kernel, a socket is an endpoint of communication.
– To an application, a socket is a file descriptor that lets the application read/write from/to
the network.
• Remember: All Unix I/O devices, including networks, are modeled as files.

• Clients and servers communicate with each other by reading from and writing
to socket descriptors.
[Figure: two hosts connected through the Internet. In each host the application process sits above a socket and is controlled by the app developer; the transport, network, link and physical layers below the socket are controlled by the OS]

20
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket Programming [..2]

Two socket types for two different transport services:


– UDP: unreliable datagram
– TCP: reliable, byte stream-oriented

Application Example:
1. Client reads a line of characters (data) from its keyboard and
sends the data to the server.
2. The server receives the data and converts characters to
uppercase.
3. The server sends the modified data to the client.
4. The client receives the modified data and displays the line on
its screen.
21
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket Programming with TCP

Client contacts server by:
• Creating TCP socket, specifying IP address, port number of server process
• Server must have created socket (door) that welcomes client's contact
• Client TCP establishes connection to server TCP

• When contacted by client, server TCP creates a new socket for server process
to communicate with that particular client
– Allows server to talk with multiple clients

Application viewpoint:
TCP provides reliable, in-order byte-stream transfer (“pipe”) between client
and server.
22
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket Structure [.1]

struct sockaddr
{
    unsigned short int sa_family;  // address family, AF_xxx
    char sa_data[14];              // 14 bytes of protocol address
};

• sa_family – AF_INET for IPv4 stream and datagram sockets
• sa_data – contains destination address and port number for the socket

23
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket Structure [..2]
• Parallel structure to sockaddr
struct sockaddr_in
{
    short int sin_family;          // Address family (e.g., AF_INET)
    unsigned short int sin_port;   // Port number (e.g., htons(2240))
    struct in_addr sin_addr;       // Internet address
    unsigned char sin_zero[8];     // padding, same size as sockaddr
};
struct in_addr
{
    unsigned long s_addr;          // 32-bit IPv4 address
};
• sin_zero pads the structure to the length of struct sockaddr and is set to all zeros
with the function memset()
• Important – a pointer to struct sockaddr_in can be cast to a pointer to struct sockaddr and vice versa
• sin_family corresponds to sa_family and should be set to AF_INET
• sin_port and sin_addr must be in network byte order (NBO)
24
Computer Networks (CS F303) BITS Pilani, Pilani Campus
NBO & HBO Conversion Functions

• Two types that can be converted
– short (two bytes)
– long (four bytes; htonl() and ntohl() operate on 32-bit values)

• Primary conversion functions
– htons() // host to network short
– htonl() // host to network long
– ntohs() // network to host short
– ntohl() // network to host long

• Very Important: Even if your machine is big-endian, still convert your bytes to NBO
before putting them on to the network, so that the code stays portable

25
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Primary Socket System Calls

• socket() - create a new socket and return its descriptor


• bind() - associate a socket with a port and address
• listen() - establish queue for connection requests
• accept() - accept a connection request
• connect() - initiate a connection to a remote host
• recv() - receive data from a socket descriptor
• send() - send data to a socket descriptor
• close() - “one-way” close of a socket descriptor

26
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket System Calls: Connection-
Oriented (e.g., TCP)

27
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket Programming with UDP

UDP: no “connection” between client & server


• No handshaking before sending data
• Sender explicitly attaches destination IP address and port # to
each packet
• Receiver extracts sender IP address and port# from received
packet

Note: Transmitted data may be lost or received out-of-order


28
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket System Calls:
Connectionless (e.g., UDP)

29
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket System Calls [.1]

• SOCKET: int socket(int domain, int type, int protocol);


– domain := AF_INET (IPv4 protocol)
– type := (SOCK_DGRAM or SOCK_STREAM )
– protocol := 0 (IPPROTO_UDP or IPPROTO_TCP)
– returned: socket descriptor (sockfd), -1 is an error

• BIND: int bind(int sockfd, struct sockaddr *my_addr, int addrlen);


– sockfd - socket descriptor (returned from socket())
– my_addr: socket address, struct sockaddr_in is used
– addrlen := sizeof(struct sockaddr)

30
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket System Calls [..2]

• LISTEN: int listen(int sockfd, int backlog);


– backlog: how many connections we want to queue

• ACCEPT: int accept(int sockfd, struct sockaddr *addr, int *addrlen);
– addr: the socket address of the caller (the connecting client) is written here
– returned: a new socket descriptor (for the per-connection socket)

• CONNECT: int connect(int sockfd, struct sockaddr *serv_addr, int addrlen); //used by TCP client
– parameters are same as for bind()

31
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket System Calls […3]
• SEND: int send(int sockfd, const void *msg, int len, int flags);
– msg: message you want to send
– len: length of the message
– flags := 0
– returned: the number of bytes actually sent

• RECEIVE: int recv(int sockfd, void *buf, int len, unsigned int flags);
– buf: buffer to receive the message
– len: length of the buffer (“don’t give me more!”)
– flags := 0
– returned: the number of bytes received
32
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Socket System Calls [….4]
• SEND (DGRAM-style): int sendto(int sockfd, const void *msg, int len, int flags, const struct sockaddr *to, int
tolen);
– msg: message you want to send
– len: length of the message
– flags := 0
– to: socket address of the remote process
– tolen: = sizeof(struct sockaddr)
– returned: the number of bytes actually sent

• RECEIVE (DGRAM-style): int recvfrom(int sockfd, void *buf, int len, unsigned int flags, struct sockaddr
*from, int *fromlen);
– buf: buffer to receive the message
– len: length of the buffer (“don’t give me more!”)
– from: socket address of the process that sent the data
– fromlen:= sizeof(struct sockaddr)
– flags := 0
– returned: the number of bytes received

• CLOSE: close (socketfd);


33
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Byte ordering routines

#include <sys/types.h>
#include <netinet/in.h>

u_long htonl(u_long hostlong); /* host-to-network, long integer */

u_short htons(u_short hostshort); /* host-to-network, short integer */

u_long ntohl(u_long netlong); /* network-to-host, long integer */

u_short ntohs(u_short netshort); /* network-to-host, short integer */

Address conversion routines


#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

unsigned long inet_addr(char *ptr);


accepts a char string of IP address and returns a 32-bit network byte-order integer equivalent.
char *inet_ntoa(struct in_addr inaddr);
accepts an IP address expressed as a 32-bit quantity in network byte order and returns a string
expressed in dotted-decimal notation 34
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Simple TCP Server
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define SERVER_PORT 5888

int main()
{   int sockfd, connfd, n;
    socklen_t clilen;
    char buf[256];
    const char *reply = "Server Got Message";
    struct sockaddr_in servaddr, cliaddr;

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) { printf("Server socket error\n"); exit(1); }

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERVER_PORT);
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0)
    { printf("Server Bind Error\n"); exit(1); }

    listen(sockfd, 5);

    for (;;) {
        clilen = sizeof(cliaddr);
        connfd = accept(sockfd, (struct sockaddr *)&cliaddr, &clilen);
        if (connfd < 0) { printf("Server Accept error\n"); exit(1); }

        printf("Client IP: %s\n", inet_ntoa(cliaddr.sin_addr));
        printf("Client Port: %hu\n", ntohs(cliaddr.sin_port));

        n = read(connfd, buf, sizeof(buf) - 1);   /* leave room for '\0' */
        if (n < 0) n = 0;
        buf[n] = '\0';
        printf("Server read: \"%s\" [%d chars]\n", buf, n);

        write(connfd, reply, strlen(reply));      /* send the whole reply */
        close(connfd);
    }
}

35
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Simple TCP Client
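#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define SERVER_PORT 5888
int main()
{   int sockfd, n;
    char buf[256];
    struct sockaddr_in servaddr;
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) { printf("Client socket error\n"); exit(1); }

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERVER_PORT);
    servaddr.sin_addr.s_addr = inet_addr("172.24.2.4");

    if (connect(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0)
    { printf("Client connect error\n"); exit(1); }

    printf("Enter Message \n");
    if (fgets(buf, sizeof(buf), stdin) == NULL) { close(sockfd); exit(1); }
    write(sockfd, buf, strlen(buf));

    n = read(sockfd, buf, sizeof(buf) - 1);   /* leave room for '\0' */
    if (n < 0) n = 0;
    buf[n] = '\0';
    printf("Client Received: %s\n", buf);
    close(sockfd);
    return 0;
}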
36
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Simple UDP Server
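#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#define SERVER_PORT 9988
int main()
{   int sockfd, n;
    socklen_t clilen;
    char buf[256];
    const char *reply = "Server Got Message";
    struct sockaddr_in servaddr, cliaddr;
    sockfd = socket(AF_INET, SOCK_DGRAM, 0);
    if (sockfd < 0) { printf("Server socket error\n"); exit(1); }
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERVER_PORT);
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0)
    { printf("Server Bind Error\n"); exit(1); }
    for (;;)
    {   clilen = sizeof(cliaddr);
        n = recvfrom(sockfd, buf, sizeof(buf) - 1, 0, (struct sockaddr *)&cliaddr, &clilen);
        if (n < 0) continue;                  /* skip failed receives */
        buf[n] = '\0';                        /* datagram is not null-terminated */

        printf("Server Received:%s\n", buf);

        sendto(sockfd, reply, strlen(reply), 0, (struct sockaddr *)&cliaddr, clilen);
    }
}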
37
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Simple UDP Client

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define SERVER_PORT 9988
#define SERVER_IPADDR "172.24.2.4"
int main()
{   int sockfd, n;
    socklen_t len;
    char buf[256];
    struct sockaddr_in cliaddr, servaddr;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERVER_PORT);
    servaddr.sin_addr.s_addr = inet_addr(SERVER_IPADDR);

    sockfd = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&cliaddr, 0, sizeof(cliaddr));
    cliaddr.sin_family = AF_INET;
    cliaddr.sin_port = htons(0);              /* 0: let the OS pick a port */
    cliaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(sockfd, (struct sockaddr *)&cliaddr, sizeof(cliaddr));

    printf("Enter Message\n");
    fgets(buf, 255, stdin);
    len = sizeof(servaddr);
    sendto(sockfd, buf, strlen(buf), 0, (struct sockaddr *)&servaddr, len);

    n = recvfrom(sockfd, buf, sizeof(buf) - 1, 0, NULL, NULL);
    if (n < 0) n = 0;
    buf[n] = '\0';
    printf("Client Received: %s \n", buf);
    close(sockfd);
    return 0;
}

38
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Thank You!

39
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Computer Networks (CS F303)
Virendra Singh Shekhawat
BITS Pilani Department of Computer Science and Information Systems
Pilani Campus
Topics

• Transport Layer
– Transport Layer Services
• Multiplexing/Demultiplexing
– Connectionless and Connection Oriented
» TCP and UDP
• Reliable data transfer
• Flow control
• Congestion control

2
Computer Networks (CS F303) BITS Pilani, Pilani Campus
Transport Layer Services and Protocols

• Provides logical communication between app processes
– App processes send messages to each other using the
logical communication

• Extends host-to-host delivery to process-to-process delivery

TP Layer vs. Network Layer
• Network layer: logical communication between hosts

• TP Layer: logical communication between processes

• TP layer services are constrained by the service model of the underlying network-layer protocol

• But the TP layer can offer certain services even when the network layer does not offer them
– e.g., Reliable data transfer

Transport Layer Services

• Reliable in-order delivery (TCP)
– Congestion control
– Flow control
– Connection setup

• Unreliable, unordered delivery (UDP)
– Extension of “best-effort” IP

How does a receiving host direct an incoming segment to the appropriate socket?
• Multiplexing as sender: handle data from multiple sockets, add transport header
• Demultiplexing as receiver: use header info to deliver received segments to correct socket

[Figure: three hosts, each with application, transport, network, link and physical layers; processes P1 and P2 on the middle host, P3 and P4 on the outer hosts, each process attached to a socket]

How does demultiplexing work?

• Host receives IP datagrams
– Each datagram has source IP address, destination IP address
– Each datagram carries one transport-layer segment
– Each segment has source and destination port numbers
• Host uses IP addresses & port numbers to direct segment to appropriate socket

[Figure: TCP/UDP segment format, 32 bits wide: source port # | dest port #; other header fields; application data (payload)]

Connectionless (UDP) Demultiplexing

• When host receives UDP segment:
– Checks destination port # in segment and directs segment to the socket bound to that port #

• Recall: when creating datagram to send into UDP socket, must specify
– Destination IP address
– Destination port #
Connectionless Demultiplexing: an Example
P1 (server): DatagramSocket serverSocket = new DatagramSocket(6428);
P3: DatagramSocket mySocket2 = new DatagramSocket(9157);
P4: DatagramSocket mySocket1 = new DatagramSocket(5775);

[Figure: three hosts with application, transport, network, link and physical layers; P1 on the middle host, P3 and P4 on the outer hosts]

Segments between P1 and P3:
– source port: 6428, dest port: 9157
– source port: 9157, dest port: 6428
Segments between P1 and P4:
– source port: ?, dest port: ?
– source port: ?, dest port: ?
Connection Oriented Demux

[Figure: server at IP address B running processes P4, P5, P6; host at IP address A running P3; host at IP address C running P2 and P3; every host has application, transport, network, link and physical layers]

Segments shown:
– source IP,port: A,9157 → dest IP,port: B,80
– source IP,port: C,5775 → dest IP,port: B,80
– source IP,port: C,9157 → dest IP,port: B,80
– source IP,port: B,80 → dest IP,port: A,9157

Three segments, all destined to IP address B, dest port 80, are demultiplexed to different sockets
Connection Oriented Demultiplexing

• TCP socket identified by 4-tuple:
– Source IP address, source port #, dest IP address, dest port #

• Server host may support many simultaneous TCP sockets:
– Each socket identified by its own 4-tuple

• Web servers have different sockets for each connecting client
– e.g., non-persistent HTTP will have a different socket for each request
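
The difference between the two demultiplexing rules can be sketched in C. This is only an illustration under stated assumptions: the `sock_entry` table, its field names and both lookup functions are hypothetical, not part of any real socket API, and the UDP lookup matches on the destination port alone, as the slides describe.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical socket-table entry; field names are illustrative only. */
struct sock_entry {
    int      is_tcp;       /* 0 = UDP socket, 1 = connected TCP socket */
    uint32_t local_ip;
    uint16_t local_port;
    uint32_t remote_ip;    /* used only for TCP */
    uint16_t remote_port;  /* used only for TCP */
};

/* UDP: the destination port alone selects the socket. */
int demux_udp(const struct sock_entry *tbl, size_t n, uint16_t dst_port)
{
    for (size_t i = 0; i < n; i++)
        if (!tbl[i].is_tcp && tbl[i].local_port == dst_port)
            return (int)i;
    return -1;   /* no socket bound to this port */
}

/* TCP: all four values of the 4-tuple must match. */
int demux_tcp(const struct sock_entry *tbl, size_t n,
              uint32_t src_ip, uint16_t src_port,
              uint32_t dst_ip, uint16_t dst_port)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].is_tcp &&
            tbl[i].remote_ip == src_ip && tbl[i].remote_port == src_port &&
            tbl[i].local_ip == dst_ip && tbl[i].local_port == dst_port)
            return (int)i;
    return -1;   /* no connection with this 4-tuple */
}
```

With a table holding three TCP entries for server B on port 80 (peers A:9157, C:5775 and C:9157), `demux_tcp` returns a different index for each segment even though all three carry the same destination IP address and port.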

Example: Threaded Server
[Figure: threaded server at IP address B, where a single process P4 serves every connection through a separate socket; host at IP address A running P3; host at IP address C running P2 and P3]

Segments shown:
– source IP,port: A,9157 → dest IP,port: B,80
– source IP,port: C,5775 → dest IP,port: B,80
– source IP,port: C,9157 → dest IP,port: B,80
– source IP,port: B,80 → dest IP,port: A,9157
User Datagram Protocol [RFC 768]

• Best effort service
– UDP segments may be lost or delivered out of order to the app
• Connectionless
– No handshaking between sender and receiver

• Each UDP segment handled independently of others

UDP Segment Header
[Figure: UDP segment format, 32 bits wide: source port # | dest port #; length | checksum; application data (payload). Length is in bytes of the UDP segment, including header]

Why is there a UDP?
• No connection establishment (which can add delay)
• Simple: no connection state at sender, receiver
• Small header size
• No congestion control: UDP can blast away as fast as desired
UDP Checksum

• At the sender, treat segment contents (with header fields) as a sequence of 16-bit words
– One’s complement of the sum of all 16-bit words is put in the checksum field

• At the receiver, all 16-bit words are added (including the checksum) to detect errors in the segment
– If the result is not all 1s, the segment contains an error
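
The sender and receiver steps above can be sketched as follows. This is a minimal illustration, assuming the words are stored in network (big-endian) byte order; the function names are made up for this example, and the pseudo-header that a real UDP checksum also covers is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Add 16-bit big-endian words with end-around carry (one's complement sum). */
static uint16_t ones_complement_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (len % 2 == 1)                 /* odd length: pad the last byte with zero */
        sum += (uint32_t)(data[len - 1] << 8);
    while (sum >> 16)                 /* fold carries back into the low 16 bits */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Sender: checksum field = one's complement of the sum. */
uint16_t checksum16(const uint8_t *data, size_t len)
{
    return (uint16_t)~ones_complement_sum(data, len);
}

/* Receiver: the sum over data plus checksum must be all 1s (0xFFFF). */
int checksum_ok(const uint8_t *data_with_checksum, size_t len)
{
    return ones_complement_sum(data_with_checksum, len) == 0xFFFF;
}
```

For the three words 0x6660, 0x5555 and 0x8F0C the wrapped sum is 0x4AC2, so the checksum field carries 0xB53D, and adding all four words at the receiver gives 0xFFFF.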

Principles of Reliable Data Transfer

• Important in application, transport, link layers
• Top-10 list of important networking topics!
• Characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)
Reliable Data Transfer: getting started

• rdt_send(): called from above (e.g., by app.). Passed data to be delivered to receiver upper layer
• deliver_data(): called by rdt to deliver data to upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
• rdt_rcv(): called when packet arrives on rcv-side of channel

[Figure: rdt_send() and udt_send() on the send side; rdt_rcv() and deliver_data() on the receive side]
Reliable Data Transfer: getting started

We will:
• Incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• Consider only unidirectional data transfer
– But control info will flow in both directions!
• Use finite state machines (FSM) to specify sender, receiver
– State: when in this “state”, next state is uniquely determined by next event

[Figure: FSM notation. An arrow from state 1 to state 2 is labelled with the event causing the state transition above the line and the actions taken on the state transition below it]

rdt1.0: reliable transfer over a reliable channel

• Underlying channel perfectly reliable
– No bit errors, no loss of packets
• Separate FSMs for sender, receiver:
– Sender sends data into underlying channel
– Receiver reads data from underlying channel

Sender (single state: Wait for call from above)
  rdt_send(data)
    packet = make_pkt(data)
    udt_send(packet)

Receiver (single state: Wait for call from below)
  rdt_rcv(packet)
    extract(packet,data)
    deliver_data(data)
rdt2.0: channel with bit errors
• Underlying channel may flip bits in packet
– A checksum detects the bit errors

• The question: how to recover from errors?
– Acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
– Negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
– Sender retransmits pkt on receipt of NAK

• New mechanisms in rdt2.0 (beyond rdt1.0):
– Error detection
– Receiver feedback: control msgs (ACK,NAK) rcvr->sender
rdt2.0: FSM Specification
Sender
  State: Wait for call from above
    rdt_send(data)
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      → Wait for ACK or NAK
  State: Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      udt_send(sndpkt)
      → Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      Λ (no action)
      → Wait for call from above

Receiver
  State: Wait for call from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      extract(rcvpkt,data)
      deliver_data(data)
      udt_send(ACK)
rdt2.0: Operation with no Errors
[Figure: the rdt2.0 sender and receiver FSMs, tracing the error-free path]
1. Sender: rdt_send(data) → sndpkt = make_pkt(data, checksum); udt_send(sndpkt); wait for ACK or NAK
2. Receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) → extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
3. Sender: rdt_rcv(rcvpkt) && isACK(rcvpkt) → Λ (no action); wait for call from above
rdt2.0: Error Scenario
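[Figure: the rdt2.0 sender and receiver FSMs, tracing the path taken when a packet is corrupted]
1. Sender: rdt_send(data) → sndpkt = make_pkt(data, checksum); udt_send(sndpkt); wait for ACK or NAK
2. Receiver: rdt_rcv(rcvpkt) && corrupt(rcvpkt) → udt_send(NAK)
3. Sender: rdt_rcv(rcvpkt) && isNAK(rcvpkt) → udt_send(sndpkt) (retransmit); keep waiting for ACK or NAK
4. Receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) → extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
5. Sender: rdt_rcv(rcvpkt) && isACK(rcvpkt) → Λ (no action); wait for call from above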
rdt2.0 Has a fatal flaw!

• What happens if ACK/NAK is corrupted?
– Sender doesn’t know what happened at receiver!
– Simple fix: just retransmit. But is there an issue? The receiver may get a duplicate pkt

• How to handle duplicates?
– Sender adds sequence number to each pkt
– Receiver discards (doesn’t deliver up) duplicate pkt

rdt2.1: Sender, handles garbled ACK/NAKs

State: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    → Wait for ACK or NAK 0
State: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
    → Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ (no action)
    → Wait for call 1 from above
State: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    → Wait for ACK or NAK 1
State: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
    → Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ (no action)
    → Wait for call 0 from above

rdt2.1: Receiver, handles garbled ACK/NAKs

State: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    → Wait for 1 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
State: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    → Wait for 0 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
rdt2.1: Discussion

Sender:
• Seq # added to pkt
• Two seq. #’s (0,1) will suffice. Why?
• Must check if received ACK/NAK corrupted
• Twice as many states
– State must “remember” whether “current” pkt has 0 or 1 seq. #

Receiver:
• Must check if received packet is duplicate
– State indicates whether 0 or 1 is expected pkt seq #
– For a duplicate packet (unexpected seq #), it resends the ACK but does not deliver the data again
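
The receiver side of this discussion can be sketched as a small state machine in C. The types and names below (`rdt_pkt`, `rdt21_receiver`, `rdt21_on_packet`) are hypothetical and reduce a packet to the two properties the FSM inspects; delivery to the upper layer is represented only by a counter.

```c
#include <stdbool.h>

/* Hypothetical packet summary: only what the receiver FSM inspects. */
struct rdt_pkt {
    bool corrupt;  /* result of the checksum test */
    int  seq;      /* 0 or 1 */
};

enum rdt_reply { SEND_ACK, SEND_NAK };

struct rdt21_receiver {
    int expected_seq;   /* the FSM state: waiting for 0 or for 1 */
    int delivered;      /* packets passed to the upper layer */
};

enum rdt_reply rdt21_on_packet(struct rdt21_receiver *r, struct rdt_pkt p)
{
    if (p.corrupt)
        return SEND_NAK;              /* ask for a retransmission */
    if (p.seq == r->expected_seq) {
        r->delivered++;               /* stands in for extract + deliver_data */
        r->expected_seq ^= 1;         /* flip 0 <-> 1 */
    }
    /* Duplicate: the sender missed our ACK, so ACK again without delivering. */
    return SEND_ACK;
}
```

Feeding the same uncorrupted packet twice returns `SEND_ACK` both times but increments `delivered` only once, which is the duplicate handling that a one-bit sequence number makes possible.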

rdt2.2: NAK Free Protocol

• Same functionality as rdt2.1, using ACKs only


• Instead of NAK, receiver sends ACK for last pkt received OK
– Receiver must explicitly include seq # of pkt being ACKed
rdt3.0: Channels with errors and loss

• New assumption: Underlying channel can also lose packets (data or ACKs)
– Checksum, seq. #, ACKs, retransmissions will help, but are not enough. Why?

• Approach: Sender waits “reasonable” amount of time for ACK
– Retransmits if no ACK received in this time
– If pkt (or ACK) just delayed (not lost):
• Retransmission will be duplicate, but use of seq. #’s already handles this
• Receiver must specify seq # of pkt being ACKed
– Requires countdown timer
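
A compact way to see how the timer fits in is to code the four sender states directly. The sketch below is an assumption-laden illustration, not the course's reference code: the struct, the function names and the `transmissions` counter are invented here, and the real timer is reduced to a boolean flag that the caller drives by calling `rdt30_timeout`.

```c
#include <stdbool.h>

enum s_state { WAIT_CALL_0, WAIT_ACK_0, WAIT_CALL_1, WAIT_ACK_1 };

struct rdt30_sender {
    enum s_state state;
    int  transmissions;   /* counts udt_send calls */
    bool timer_running;
};

/* rdt_send(data) from above; returns false if a packet is still outstanding. */
bool rdt30_send(struct rdt30_sender *s)
{
    if (s->state != WAIT_CALL_0 && s->state != WAIT_CALL_1)
        return false;
    s->transmissions++;                 /* udt_send(sndpkt) */
    s->timer_running = true;            /* start_timer */
    s->state = (s->state == WAIT_CALL_0) ? WAIT_ACK_0 : WAIT_ACK_1;
    return true;
}

/* Timer expiry: resend the outstanding packet and restart the timer. */
void rdt30_timeout(struct rdt30_sender *s)
{
    if (s->state == WAIT_ACK_0 || s->state == WAIT_ACK_1) {
        s->transmissions++;
        s->timer_running = true;
    }
}

/* ACK arrival: only an uncorrupted ACK with the awaited seq # advances. */
void rdt30_on_ack(struct rdt30_sender *s, bool corrupt, int acknum)
{
    int awaited = (s->state == WAIT_ACK_0) ? 0
                : (s->state == WAIT_ACK_1) ? 1 : -1;
    if (corrupt || acknum != awaited)
        return;                         /* ignore; the timer will recover */
    s->timer_running = false;           /* stop_timer */
    s->state = (awaited == 0) ? WAIT_CALL_1 : WAIT_CALL_0;
}
```

Note that a corrupted or wrong-numbered ACK triggers no retransmission by itself; the sender simply waits for the timeout, which is why loss and corruption of ACKs are handled by the same mechanism.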

rdt 3.0 Sender
State: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    → Wait for ACK0
State: Wait for ACK0
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    → Wait for call 1 from above
State: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    → Wait for ACK1
State: Wait for ACK1
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    → Wait for call 0 from above
rdt 3.0 Sender
State: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    → Wait for ACK0
  rdt_rcv(rcvpkt)
    Λ (no action)
State: Wait for ACK0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1))
    Λ (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    → Wait for call 1 from above
State: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    → Wait for ACK1
  rdt_rcv(rcvpkt)
    Λ (no action)
State: Wait for ACK1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,0))
    Λ (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    → Wait for call 0 from above
rdt 3.0 Receiver

rdt 3.0 Sender and Receiver

rdt3.0 – No Loss

sender                        receiver
send pkt0        -- pkt0 -->  rcv pkt0, send ack0
rcv ack0         <-- ack0 --
send pkt1        -- pkt1 -->  rcv pkt1, send ack1
rcv ack1         <-- ack1 --
send pkt0        -- pkt0 -->  rcv pkt0, send ack0

rdt3.0 – Data packet Loss
sender                        receiver
send pkt0        -- pkt0 -->  rcv pkt0, send ack0
rcv ack0         <-- ack0 --
send pkt1        -- pkt1 --X  (loss)
timeout
resend pkt1      -- pkt1 -->  rcv pkt1, send ack1
rcv ack1         <-- ack1 --
send pkt0        -- pkt0 -->  rcv pkt0, send ack0

rdt3.0 – ACK Loss

sender                        receiver
send pkt0        -- pkt0 -->  rcv pkt0, send ack0
rcv ack0         <-- ack0 --
send pkt1        -- pkt1 -->  rcv pkt1, send ack1
(loss)           X-- ack1 --
timeout
resend pkt1      -- pkt1 -->  rcv pkt1 (detect duplicate), send ack1
rcv ack1         <-- ack1 --
send pkt0        -- pkt0 -->  rcv pkt0, send ack0

rdt3.0 – Premature Timeout / Delayed ACK
sender                        receiver
send pkt0        -- pkt0 -->  rcv pkt0, send ack0
rcv ack0         <-- ack0 --
send pkt1        -- pkt1 -->  rcv pkt1, send ack1 (ack1 is delayed)
timeout
resend pkt1      -- pkt1 -->  rcv pkt1 (detect duplicate), send ack1
rcv ack1         <-- ack1 --  (the delayed ack1)
send pkt0        -- pkt0 -->  rcv pkt0, send ack0
rcv ack1         <-- ack1 --  (the second ack1)
(ignore)

rdt3.0 Stop and Wait Performance
• Example: 1 Gbps link, 15 ms prop. delay, 8000 bit packet
– Transmission time: L/R = 8000 bits / 10^9 bps = 0.008 ms; RTT = 2 × 15 ms = 30 ms

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} \approx 0.00027$

• rdt 3.0 protocol performance stinks!
• Protocol limits performance of underlying infrastructure (channel)
Pipelined Protocols

• Pipelining: Sender allows multiple, “in-flight”, yet-to-be-acked pkts


– Range of sequence numbers must be increased
– Buffering at sender and/or receiver

Pipelining: Increased Utilization
[Figure: sender/receiver timeline with three packets in flight]
– First packet bit transmitted, t = 0
– Last bit transmitted, t = L / R
– First packet bit arrives
– Last packet bit arrives, send ACK
– Last bit of 2nd packet arrives, send ACK
– Last bit of 3rd packet arrives, send ACK
– ACK arrives, send next packet, t = RTT + L / R

Sending 3 packets per round trip increases utilization by a factor of 3!

$U_{sender} = \frac{3L/R}{RTT + L/R} = \frac{0.024}{30.008} \approx 0.0008$ (times in milliseconds)
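
Both utilization figures come from the same formula, which a short C helper makes easy to experiment with. The function name and parameters are illustrative assumptions; `n` is the number of packets the sender may have in flight per round trip, and the result is capped at 1 because the link cannot be busy more than all of the time.

```c
/* Sender utilization with n packets sent back to back per round trip.
 * bits: packet length L; rate_bps: link rate R; rtt_s: round-trip time in seconds. */
double sender_utilization(double bits, double rate_bps, double rtt_s, int n)
{
    double d_trans = bits / rate_bps;             /* L/R in seconds */
    double u = (n * d_trans) / (rtt_s + d_trans);
    return u > 1.0 ? 1.0 : u;                     /* cap at full utilization */
}
```

Calling `sender_utilization(8000, 1e9, 0.030, 1)` reproduces the stop-and-wait case (about 0.00027) and `n = 3` reproduces the pipelined case (about 0.0008).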
Pipelining Protocols Requirements

• The range of sequence numbers must be increased


– Multiple in-transit packets

• Packet Buffering is required at both sides. Why?

• Two basic approaches


– Go-Back-N (GBN)
– Selective Repeat (SR)

Go-Back-N in Action
sender (window N=4)           receiver
send pkt0
send pkt1
send pkt2 (lost)              receive pkt0, send ack0
send pkt3                     receive pkt1, send ack1
(wait)                        receive pkt3, discard, (re)send ack1
rcv ack0, send pkt4
rcv ack1, send pkt5           receive pkt4, discard, (re)send ack1
ignore duplicate ACK          receive pkt5, discard, (re)send ack1
pkt 2 timeout
send pkt2                     rcv pkt2, deliver, send ack2
send pkt3                     rcv pkt3, deliver, send ack3
send pkt4                     rcv pkt4, deliver, send ack4
send pkt5                     rcv pkt5, deliver, send ack5
Go-Back-N: Sender
• Sender: “window” of up to N consecutive transmitted but unACKed pkts
– k-bit seq # in pkt header
• Cumulative ACK: ACK(n) ACKs all packets up to and including seq # n
– On receiving ACK(n): move window forward to begin at n+1
• Timer for oldest in-flight packet
• Timeout(n): retransmit packet n and all higher seq # packets in window
Go-Back-N: Receiver
• ACK-only: always send ACK for the correctly received packet with the highest in-order seq # so far
– May generate duplicate ACKs
– Need only to remember rcv_base
• On receipt of out-of-order packet:
– Discard (don’t buffer)
– Re-ACK pkt with highest in-order seq #

[Figure: receiver view of sequence number space: packets below rcv_base are received and ACKed; packets from rcv_base onward are not received]
GBN Sender FSM
Initialization
  base=0
  nextseqnum=0

State: Wait
  rdt_send(data)
    if (nextseqnum < base+N) { /* window not full: allowed to send */
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum) /* no packets in flight */
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)
  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1 /* advance the left edge of the window */
    if (base == nextseqnum)
      stop_timer
    else
      start_timer
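
The sender FSM translates almost line for line into C. The sketch below makes several simplifying assumptions that are not in the slides: sequence numbers are plain integers that never wrap, packets are not stored (only counted in `sent`), the timer is a boolean flag, and stale ACKs below `base` are ignored rather than applied. All names are hypothetical.

```c
#include <stdbool.h>

#define GBN_N 4   /* window size, matching the "Go-Back-N in Action" example */

struct gbn_sender {
    int  base;           /* oldest unACKed seq # */
    int  nextseqnum;     /* next seq # to use */
    bool timer_running;
    int  sent;           /* total udt_send calls, including retransmissions */
};

/* rdt_send(data): returns false (refuse_data) when the window is full. */
bool gbn_send(struct gbn_sender *s)
{
    if (s->nextseqnum >= s->base + GBN_N)
        return false;
    s->sent++;                                /* udt_send(sndpkt[nextseqnum]) */
    if (s->base == s->nextseqnum)
        s->timer_running = true;              /* first packet in flight */
    s->nextseqnum++;
    return true;
}

/* Cumulative ACK(n): everything up to and including n is acknowledged. */
void gbn_on_ack(struct gbn_sender *s, int acknum)
{
    if (acknum + 1 <= s->base)
        return;                               /* duplicate ACK: ignore */
    s->base = acknum + 1;
    s->timer_running = (s->base != s->nextseqnum);
}

/* Timeout: resend every packet in [base, nextseqnum - 1]; returns the count. */
int gbn_timeout(struct gbn_sender *s)
{
    int resent = s->nextseqnum - s->base;
    s->sent += resent;
    s->timer_running = (resent > 0);
    return resent;
}
```

Replaying the example trace (send pkt0 to pkt3, receive ack0 and ack1, send pkt4 and pkt5, then time out) makes `gbn_timeout` report four retransmissions, pkt2 through pkt5.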
GBN Receiver FSM

• Always send ACK for correctly-received pkt with highest in-order seq #
– Need only to remember “expectedseqnum”
• If out-of-order pkt arrived
– Discard it
– Re-ACK pkt with the highest in-order seq #

Initialization
  expectedseqnum=0
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)

State: Wait
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++
  default
    udt_send(sndpkt)
Selective Repeat Protocol

• Receiver individually acknowledges all correctly received pkts
– Buffers pkts, as needed, for eventual in-order delivery to upper layer

• Sender only resends pkts for which ACK not received
– Sender maintains a timer for each unACKed pkt
– When a timer expires, retransmit only that unACKed pkt

Selective Repeat in Action
sender (window N=4)           receiver
send pkt0
send pkt1
send pkt2 (lost)              receive pkt0, send ack0
send pkt3                     receive pkt1, send ack1
(wait)                        receive pkt3, buffer, send ack3
rcv ack0, send pkt4
rcv ack1, send pkt5           receive pkt4, buffer, send ack4
record ack3 arrived           receive pkt5, buffer, send ack5
pkt 2 timeout
send pkt2
record ack4 arrived
record ack5 arrived           rcv pkt2; deliver pkt2, pkt3, pkt4, pkt5; send ack2

Q: What happens when ack2 arrives?


Selective Repeat Protocol: Windows

Events at Sender
• Data from above
• Timeout
• ACK(n) in [sendbase, sendbase+N-1]

Events at Receiver
• Pkt n in [rcvbase, rcvbase+N-1]
• Pkt n in [rcvbase-N, rcvbase-1]
• Pkt received that falls in neither of the above ranges
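
The three receiver events map onto three branches of one function. The actions below follow the usual textbook description of Selective Repeat (ACK and buffer inside the window, re-ACK just below it, ignore everything else), which the slide lists only as events, so treat them as an assumption; the struct, the fixed window size `SR_N` and the non-wrapping integer sequence numbers are also simplifications made for this sketch.

```c
#include <stdbool.h>

#define SR_N 4   /* receiver window size (assumed for the example) */

struct sr_receiver {
    int  rcv_base;          /* lowest seq # not yet delivered */
    bool buffered[SR_N];    /* buffered[i]: packet rcv_base + i has arrived */
    int  delivered;         /* packets handed to the upper layer, in order */
};

/* Returns true if ACK(n) should be sent for packet n. */
bool sr_on_packet(struct sr_receiver *r, int n)
{
    if (n >= r->rcv_base && n < r->rcv_base + SR_N) {
        r->buffered[n - r->rcv_base] = true;
        /* Deliver the in-order prefix and slide the window past it. */
        while (r->buffered[0]) {
            r->delivered++;
            for (int i = 0; i + 1 < SR_N; i++)
                r->buffered[i] = r->buffered[i + 1];
            r->buffered[SR_N - 1] = false;
            r->rcv_base++;
        }
        return true;
    }
    if (n >= r->rcv_base - SR_N && n < r->rcv_base)
        return true;        /* already delivered: re-ACK so the sender can advance */
    return false;           /* outside both ranges: ignore */
}
```

Feeding packets 0, 1, 3, 4, 5 and then 2 leaves `delivered` at 2 until packet 2 arrives, at which point four packets are delivered at once and `rcv_base` jumps to 6.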
Selective Repeat: A Dilemma!
Example:
• Seq #s: 0, 1, 2, 3 (base 4 counting)
• Window size = 3

[Figure, scenario (a): sender sends pkt0, pkt1, pkt2; all are received and ACKed, and the receiver window advances to 3, 0, 1. Sender then sends pkt3 (lost) and a new pkt0. Receiver will accept packet with seq number 0]

[Figure, scenario (b): sender sends pkt0, pkt1, pkt2; all are received, but all three ACKs are lost. Sender times out and retransmits pkt0. Receiver will accept packet with seq number 0, although it is a duplicate]
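
The dilemma disappears when the window is at most half the sequence-number space, which is the standard condition for Selective Repeat. The two helper functions below are hypothetical names written for this sketch; the second one replays the slide's situation, in which the receiver has already delivered one full window starting at seq # 0.

```c
#include <stdbool.h>

/* True when a Selective Repeat window of the given size is safe for a
 * sequence-number space of seq_space values (window <= half the space). */
bool sr_window_is_safe(int seq_space, int window)
{
    return window >= 1 && 2 * window <= seq_space;
}

/* After the receiver has delivered one full window starting at seq # 0,
 * does a retransmitted old packet 0 fall inside the new receive window? */
bool old_pkt0_looks_new(int seq_space, int window)
{
    int rcv_base = window % seq_space;          /* new window starts here */
    for (int i = 0; i < window; i++)
        if ((rcv_base + i) % seq_space == 0)
            return true;
    return false;
}
```

With 4 sequence numbers, a window of 3 lets the old pkt0 land inside the new receive window (3, 0, 1), while a window of 2 does not (the new window is 2, 3).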

Topics

• Transport Layer
– TCP Protocol
• Connection Establishment
• TCP Segment Structure
• Reliable data transfer
• Flow control
• Congestion control

TCP [RFCs: 793, 1122, 1323, 2018, 2581]

• Point-to-point protocol
– One sender and one receiver
• Reliable in-order byte stream
– No message boundaries
• Pipelined
– Window size is set by congestion and flow control
• Full duplex data
– Bi-directional data flow in same connection
• Connection oriented
– Handshaking (exchange of control msgs to initialize sender, receiver state before data exchange)
• Flow controlled
– Sender will not overwhelm receiver
TCP Segment Structure
[Figure: TCP segment format, 32 bits wide]
source port # | dest port #
sequence number
acknowledgement number
head len | not used | C E U A P R S F | receive window
checksum | Urg data pointer
options (variable length)
application data (variable length)

Field notes:
– sequence number: byte stream number of the first byte in the segment (NOT the segment number!)
– acknowledgement number: seq # of next expected byte; A bit: this is an ACK
– head len: length (of TCP header)
– checksum: Internet checksum
– C, E: congestion notification
– RST, SYN, FIN (R, S, F): connection management
– receive window: flow control, # bytes receiver willing to accept
– options: TCP options
– application data: data sent by application into TCP socket

TCP: Wireshark Capture

Connection Management

• Before exchanging data, sender/receiver do “handshake”
– Agree on connection parameters

[Figure: client and server applications above the network; each side holds connection state: ESTAB and connection variables: seq # client-to-server and server-to-client, rcvBuffer size at server, client]

Client: Socket clientSocket = new Socket("hostname","port number");
Server: Socket connectionSocket = welcomeSocket.accept();
2-way Handshake

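• Will 2-way handshake always work in network?

[Figure: two 2-way exchanges. Informal version: “Let’s talk” answered by “OK”, after which both sides are ESTAB. Protocol version: the initiator chooses x and sends req_conn(x); the other side enters ESTAB and replies acc_conn(x); the initiator then enters ESTAB]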

TCP 3-way Handshake

client state: LISTEN → SYNSENT → ESTAB
server state: LISTEN → SYN RCVD → ESTAB

1. Client: choose init seq num, x; send TCP SYN msg (SYNbit=1, Seq=x); enter SYNSENT
2. Server: choose init seq num, y; send TCP SYNACK msg, acking SYN (SYNbit=1, Seq=y, ACKbit=1, ACKnum=x+1); enter SYN RCVD
3. Client: received SYNACK(x) indicates server is live; send ACK for SYNACK (ACKbit=1, ACKnum=y+1); this segment may contain client-to-server data; enter ESTAB
4. Server: received ACK(y) indicates client is live; enter ESTAB
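
The numbers exchanged in the three steps can be written down as three small constructors. The `tcp_hdr` struct here is a hypothetical reduction of the real header to the four fields the slide mentions. One detail goes beyond the slide: the third segment is given Seq=x+1, because the SYN consumes one sequence number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the header fields the handshake slide mentions. */
struct tcp_hdr {
    bool     syn, ack;
    uint32_t seq, acknum;
};

/* Step 1: client picks initial sequence number x. */
struct tcp_hdr make_syn(uint32_t x)
{
    struct tcp_hdr h = { true, false, x, 0 };
    return h;
}

/* Step 2: server picks y and acknowledges the SYN with x + 1. */
struct tcp_hdr make_synack(struct tcp_hdr syn, uint32_t y)
{
    struct tcp_hdr h = { true, true, y, syn.seq + 1 };
    return h;
}

/* Step 3: client acknowledges the SYNACK with y + 1. */
struct tcp_hdr make_ack(struct tcp_hdr synack)
{
    struct tcp_hdr h = { false, true, synack.acknum, synack.seq + 1 };
    return h;
}
```

For x = 1000 and y = 5000 the SYNACK carries ACKnum = 1001 and the final ACK carries ACKnum = 5001.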

TCP Retransmission Scenarios
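Lost ACK scenario
1. Host A sends Seq=92, 8 bytes of data
2. Host B replies ACK=100, which is lost (X)
3. Host A times out and resends Seq=92, 8 bytes of data
4. Host B replies ACK=100 again

Premature timeout scenario
1. Host A (SendBase=92) sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
2. Host B replies ACK=100 and ACK=120
3. The timeout expires before ACK=100 arrives; Host A resends Seq=92, 8 bytes of data
4. ACK=100 and ACK=120 arrive at Host A: SendBase=100, then SendBase=120
5. Host B answers the duplicate segment with a cumulative ACK for 120; SendBase=120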


TCP Retransmission Scenarios
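Cumulative ACK scenario
1. Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
2. Host B replies ACK=100, which is lost (X), and ACK=120, which arrives before the timeout
3. Host A continues with Seq=120, 15 bytes of data

Cumulative ACK covers for earlier lost ACK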
TCP Sequence Numbers and ACKs
Sequence Numbers:
• Byte stream “number” of first byte in segment’s data

Acknowledgements:
• seq # of next byte expected from other side

[Figure: outgoing segment from sender, with fields source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer]

[Figure: sender sequence number space with window size N, divided into: sent ACKed | sent, not-yet ACKed (“in-flight”) | usable but not yet sent | not usable]

[Figure: outgoing segment from receiver, with fields source port #, dest port #, sequence number, acknowledgement number, A bit, rwnd, checksum, urg pointer]
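
Since sequence numbers count bytes, the acknowledgement for an in-order segment is simply its sequence number plus its payload length. The two helpers below are a hedged sketch with invented names; the signed-difference comparison is a common way to cope with 32-bit wraparound and is an addition of this example, not something the slides discuss.

```c
#include <stdint.h>

/* ACK number the receiver returns after an in-order segment:
 * the sequence number of the next byte it expects. */
uint32_t ack_for_segment(uint32_t seq, uint32_t payload_len)
{
    return seq + payload_len;
}

/* Sender side: a cumulative ACK moves SendBase forward only. */
uint32_t advance_send_base(uint32_t send_base, uint32_t acknum)
{
    /* The signed difference copes with wraparound of 32-bit sequence numbers. */
    return ((int32_t)(acknum - send_base) > 0) ? acknum : send_base;
}
```

A segment with Seq=92 and 8 bytes of data is acknowledged with ACK=100, and one with Seq=100 and 20 bytes with ACK=120, matching the retransmission scenarios.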
TCP Sender
Event: Data received from application
• Create segment with seq #
• Seq # is byte-stream number of first data byte in segment
• Start timer if not already running
– Think of timer as for oldest unACKed segment
– Expiration interval: TimeOutInterval

Event: Timeout
• Retransmit segment that caused timeout
• Restart timer

Event: ACK received
• If ACK acknowledges previously unACKed segments
– Update what is known to be ACKed
– Start timer if there are still unACKed segments
TCP Receiver: ACK generation [RFC 5681]
Event at receiver → TCP receiver action
• Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed → Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
• Arrival of in-order segment with expected seq #. One other segment has ACK pending → Immediately send single cumulative ACK, ACKing both in-order segments
• Arrival of out-of-order segment with higher-than-expected seq #. Gap detected → Immediately send duplicate ACK, indicating seq # of next expected byte
• Arrival of segment that partially or completely fills gap → Immediately send ACK, provided that segment starts at lower end of gap

Is TCP GBN or SR…?

1. Are out-of-order segments individually ACKed?
2. Are ACKs cumulative?
3. How many timers are maintained by sender?
4. Does the TCP receiver accept out-of-order segments?

TCP Timeout

• How to set TCP Timeout value?
– Must be longer than RTT
– Too short: premature timeouts and unnecessary retransmissions
– Too long: slow reaction to segment loss

• How to estimate RTT?
– SampleRTT: measured time from segment transmission until ACK receipt

Exercise
• Suppose Host A and Host B are using TCP to communicate with each other. After connection establishment, Host A sends three data segments to Host B in sequence.
– Sequence # = 100, Segment length = 100 Bytes
– Sequence # = 200, Segment length = 200 Bytes
– Sequence # = 400, Segment length = 400 Bytes
• Assume that only the third segment arrives at Host B; the other two segments are dropped along the way.

• a) What action will Host B take when the third segment arrives?

• b) Now assume a timeout event occurs at Host A. What action will Host A take?

RTT Estimation
EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT
– Influence of past sample decreases exponentially fast
– Typical value of α = 0.125

[Figure: SampleRTT and EstimatedRTT for gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds), y-axis: RTT (milliseconds)]

Timeout Interval

• Timeout Interval
– Estimated RTT + “Safety margin”
– Large variation in Estimated RTT → large safety margin

DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)

TimeoutInterval = EstimatedRTT + 4*DevRTT
(EstimatedRTT is the estimated RTT; 4*DevRTT is the “safety margin”)
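
The two moving averages and the timeout can be combined into one update function. This is a sketch under stated assumptions: the struct and function names are invented, times are in milliseconds, and DevRTT is updated before EstimatedRTT using the old estimate, which is the order RFC 6298 prescribes; the slides do not say which order to use, and the other order gives slightly different numbers.

```c
struct rtt_estimator {
    double estimated_rtt;   /* ms */
    double dev_rtt;         /* ms */
};

#define RTT_ALPHA 0.125
#define RTT_BETA  0.25

/* Fold one SampleRTT (ms) into the estimator and return TimeoutInterval (ms). */
double rtt_update(struct rtt_estimator *e, double sample_rtt)
{
    double diff = sample_rtt - e->estimated_rtt;
    if (diff < 0)
        diff = -diff;
    /* Assumed order (as in RFC 6298): deviation first, using the old estimate. */
    e->dev_rtt = (1.0 - RTT_BETA) * e->dev_rtt + RTT_BETA * diff;
    e->estimated_rtt = (1.0 - RTT_ALPHA) * e->estimated_rtt + RTT_ALPHA * sample_rtt;
    return e->estimated_rtt + 4.0 * e->dev_rtt;
}
```

Starting from EstimatedRTT = 100 ms and DevRTT = 10 ms, a 120 ms sample gives DevRTT = 12.5 ms, EstimatedRTT = 102.5 ms and a TimeoutInterval of 152.5 ms.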

Exercise: Timeout Interval Calculation

• Consider three RTT samples (in ms): 150, 200 and 210, in that order. Assume initial EstimatedRTT = 200 ms, initial DevRTT = 50 ms, β = 0.25 and α = 0.125. Calculate the TimeoutInterval after each sample.

TCP Connection Close

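client state: ESTAB → FIN_WAIT_1 → FIN_WAIT_2 → TIMED_WAIT → CLOSED
server state: ESTAB → CLOSE_WAIT → LAST_ACK → CLOSED

1. Client: clientSocket.close(); send FINbit=1, seq=x; enter FIN_WAIT_1 (can no longer send but can receive data)
2. Server: reply ACKbit=1; ACKnum=x+1; enter CLOSE_WAIT (can still send data). Client enters FIN_WAIT_2 and waits for server close
3. Server: send FINbit=1, seq=y; enter LAST_ACK (can no longer send data)
4. Client: reply ACKbit=1; ACKnum=y+1; enter TIMED_WAIT (timed wait for 2*max segment lifetime), then CLOSED. Server enters CLOSED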
TCP Connection States-Client and Server
