01 Network Introduction
https://inventiontourblog.wordpress.com/2015/03/31/internet-advanced-research-project-agency-arpa-develops-the-first-computer-network/
Introduction – Why TCP/IP ?
● The gap between applications and Network
○ Network
■ 802.3 Ethernet
■ 802.4 Token bus
■ 802.5 Token Ring
■ 802.11 Wireless
■ 802.16 WiMAX
○ Application
■ Reliable
■ Performance
[Figure: layered stack, top to bottom: Applications; Libraries; Linux kernel (high-level abstractions: network protocols, file-systems; low-level interfaces); Hardware]
Introduction – Layers of TCP/IP (2)
● Each layer has several protocols
○ A layer define a data
[Figure: several user processes at the application layer, down to the physical media]
Introduction – Layers of TCP/IP (2)
● ISO/OSI Model (International Organization for Standardization / Open System Interconnection Reference Model)
● TCP/IP Model

OSI Model        TCP/IP
Application      Application
Presentation     Application
Session          Application
Transport        Transport
Network          Internet
Data-link        Network Interface
Physical         Network Interface
Introduction – Addressing
● Addressing
○ MAC Address
■ Media Access Control Address
■ 48-bit Network Interface Card Hardware Address
● 24-bit manufacturer ID
● 24-bit serial number
■ Ex:
● 00:07:e9:10:e6:6b
○ IP Address
■ 32-bit Internet Address (IPv4)
■ Ex:
● 140.113.209.64
○ Port
■ 16-bit number that uniquely identifies an application (0 ~ 65535)
■ Ex:
● FTP port 21, SSH port 22, Telnet port 23
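The three kinds of address above can be checked with a few lines of Python. This is a minimal sketch: the function names `split_mac`, `ipv4_to_int` and `is_valid_port` are made up for illustration, and the MAC and IP values are the examples from this slide.

```python
import ipaddress


def split_mac(mac: str) -> tuple[str, str]:
    """Split a colon-separated 48-bit MAC into (manufacturer ID, serial number)."""
    octets = mac.lower().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    for o in octets:
        int(o, 16)  # raises ValueError if an octet is not hexadecimal
    # First 24 bits: manufacturer ID; last 24 bits: serial number.
    return ":".join(octets[:3]), ":".join(octets[3:])


def ipv4_to_int(addr: str) -> int:
    """Return the 32-bit integer behind a dotted-decimal IPv4 address."""
    return int(ipaddress.IPv4Address(addr))


def is_valid_port(port: int) -> bool:
    """A port is a 16-bit unsigned number."""
    return 0 <= port <= 0xFFFF


if __name__ == "__main__":
    print(split_mac("00:07:e9:10:e6:6b"))
    print(hex(ipv4_to_int("140.113.209.64")))
    print(is_valid_port(22), is_valid_port(65536))
```

The split at 24 bits mirrors the manufacturer ID / serial number division listed above; the port check only tests the 16-bit range, not whether a service is listening.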
Link Layer
Computer Center, Department of Computer Science, National Yang Ming Chiao Tung University
Computer Center of Department of Computer Science, NYCU
Link Layer – Introduction of Link Layer
● Purpose of the link layer
○ Send and receive IP datagram for IP module
○ ARP request and reply
○ RARP request and reply
● TCP/IP supports various link layers, depending on the type of hardware used:
○ Ethernet
■ Taught in this class
○ Token Ring
○ FDDI (Fiber Distributed Data Interface)
○ Serial Line
Link Layer – Ethernet
● Features
○ Predominant form of LAN technology used today
○ Use CSMA/CD
■ Carrier Sense, Multiple Access with Collision Detection
○ Use 48-bit MAC address
○ Operate at 10 Mbps
■ Fast Ethernet at 100 Mbps
■ Gigabit Ethernet at 1000 Mbps
■ 10 Gigabit Ethernet at 10,000 Mbps (10Gbps)
○ Ethernet frame format is defined in RFC 894
■ This is the format actually used in practice
Link Layer – Ethernet Frame Format
● 48-bit hardware address
○ For both destination and source address
● 16-bit type is used to specify the type of following data
○ 0800 IP datagram
○ 0806 ARP, 8035 RARP
Ethernet Encapsulation (RFC 894)

Field          destination addr   source addr   type   data      CRC
Size (bytes)   6                  6             2      46-1500   4

type   data                                   size (bytes)
0800   IP datagram                            46-1500
0806   ARP request/reply (28) + PAD (18)      46
8035   RARP request/reply (28) + PAD (18)     46
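The frame layout above (6-byte destination, 6-byte source, 2-byte type) can be decoded with `struct`. This is a minimal sketch; `parse_ethernet_header` and the `ETHERTYPES` table are illustrative names, and the sample frame is hand-built rather than captured.

```python
import struct

# Type values listed on this slide.
ETHERTYPES = {0x0800: "IP datagram", 0x0806: "ARP", 0x8035: "RARP"}


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet_header(frame: bytes) -> dict:
    """Decode the 14-byte RFC 894 header; the CRC trailer is not handled here."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than the 14-byte Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "type": ethertype,
        "type_name": ETHERTYPES.get(ethertype, "unknown"),
        "payload": frame[14:],
    }


if __name__ == "__main__":
    # Broadcast ARP frame with a 46-byte (zero-filled) data field.
    sample = bytes.fromhex("ffffffffffff" "009096238f7d" "0806") + bytes(46)
    print(parse_ethernet_header(sample))
```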
Link Layer – Loopback Interface
● Pseudo NIC
○ Allow client and server on the same host to communicate with each other using TCP/IP
○ IP
■ 127.0.0.1
○ Hostname
■ localhost
[Figure: processing of IP datagrams by the loopback interface]
○ IP output function
■ Destination IP address equal broadcast address or multicast address? Yes: place on IP input queue through the loopback driver, and also send on the Ethernet
■ Destination IP address equal interface IP address? Yes: place on IP input queue through the loopback driver
■ No: use ARP to get destination Ethernet address, then send through the Ethernet driver
○ Ethernet driver, on receive
■ Demultiplex based on Ethernet frame type: IP (to the IP input function) or ARP
Link Layer – MTU
● Maximum Transmission Unit (MTU)
○ Limit size of payload part of Ethernet frame
● Path MTU
○ Smallest MTU of any data link between the two hosts
○ Depend on route

Network                                  MTU (bytes)
Hyperchannel                             65535
4 Mbits/sec token ring (IEEE 802.5)      4464
Link Layer – MTU
● To get MTU info
$ ifconfig
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 9000
options=b<RXCSUM,TXCSUM,VLAN_MTU>
inet 192.168.7.1 netmask 0xffffff00 broadcast 192.168.7.255
ether 00:0e:0c:01:d7:c8
media: Ethernet autoselect (1000baseTX <full-duplex>)
status: active
fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=b<RXCSUM,TXCSUM,VLAN_MTU>
inet 140.113.17.24 netmask 0xffffff00 broadcast 140.113.17.255
ether 00:02:b3:99:3e:71
media: Ethernet autoselect (100baseTX <full-duplex>)
status: active
Network Layer
Network Layer – Introduction to Network Layer
● Unreliable and connectionless datagram delivery service
○ IP Routing
○ IP provides best effort service (unreliable)
○ IP datagram can be delivered out of order (connectionless)
● Protocols using IP
○ TCP, UDP, ICMP, IGMP
Network Layer – IP Header
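● Header is 20 bytes long, unless options are present
bits 0-15 | bits 16-31
4-bit version, 4-bit header length, 8-bit type of service (TOS) | 16-bit total length (in bytes)
16-bit identification | 3-bit flags, 13-bit fragment offset
8-bit time to live (TTL), 8-bit protocol | 16-bit header checksum
32-bit source IP address
32-bit destination IP address
(the five rows above are the 20-byte header)
options (if any)
data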
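The fixed 20-byte header can be unpacked field by field. The sketch below assumes a header without options; `parse_ipv4_header` is an illustrative name, and the sample header is built by hand with addresses that appear elsewhere in these slides.

```python
import socket
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    if len(packet) < 20:
        raise ValueError("packet is shorter than the 20-byte IPv4 header")
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,                   # upper 4 bits
        "header_len": (ver_ihl & 0x0F) * 4,        # counted in 32-bit words
        "tos": tos,
        "total_len": total_len,
        "id": ident,
        "flags": flags_frag >> 13,                 # 3-bit flags
        "frag_offset": (flags_frag & 0x1FFF) * 8,  # 13-bit offset, 8-byte units
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


if __name__ == "__main__":
    sample = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40, 1, 0x4000, 64, 6, 0,
        socket.inet_aton("140.113.17.24"),
        socket.inet_aton("140.113.235.131"),
    )
    print(parse_ipv4_header(sample))
```

The header length field counts 32-bit words, so a value of 5 means 20 bytes; the fragment offset is stored in units of 8 bytes.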
The Network Layer – IP Address
● 32-bit long
○ Network part
■ Identify a logical network
○ Host part
■ Identify a machine on certain network
● E.g.,
○ NCTU
■ Class B address: 140.113.0.0
■ Network ID: 140.113
■ Number of hosts: 256*256 = 65536
● IP address category
Class   1st byte   Format    Comments
A       1-126      N.H.H.H   Very early networks, or reserved for DOD
B       128-191    N.N.H.H   Large sites, usually subnetted, were hard to get
C       192-223    N.N.N.H   Easy to get, often obtained in sets
D       224-239    -         Multicast addresses, not permanently assigned
E       240-254    -         Experimental addresses
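The class table can be applied mechanically to the first byte of an address. A minimal sketch, with `address_class` as an illustrative name; the ranges are copied from the table, and first bytes outside them (0, 127, 255) are reported as "reserved", which is an assumption of this sketch.

```python
def address_class(addr: str) -> str:
    """Return the classful category (A-E) of a dotted-decimal IPv4 address."""
    parts = addr.split(".")
    if len(parts) != 4:
        raise ValueError(f"not a dotted-decimal IPv4 address: {addr!r}")
    first = int(parts[0])
    if not 0 <= first <= 255:
        raise ValueError(f"first byte out of range: {first}")
    if 1 <= first <= 126:
        return "A"
    if 128 <= first <= 191:
        return "B"
    if 192 <= first <= 223:
        return "C"
    if 224 <= first <= 239:
        return "D"
    if 240 <= first <= 254:
        return "E"
    return "reserved"  # 0, 127 (loopback) and 255 are not in the table


if __name__ == "__main__":
    for a in ("140.113.0.0", "10.0.0.1", "224.0.0.1", "127.0.0.1"):
        print(a, address_class(a))
```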
Network Layer – Subnetting, CIDR, and Netmask (1)
● Problems of Class A or B network
○ Number of hosts is enormous
○ Hard to maintain and manage
○ Solution => Subnetting
● Problems of Class C network
○ The large number of Class C networks (2^21, about 2 million) makes Internet routing tables huge
○ Solution => Classless Inter-Domain Routing
Network Layer – Subnetting, CIDR, and Netmask (2)
● Subnetting
○ Borrow some bits from the host ID to extend the network ID
○ E.g.,
■ Class B address : 140.113.0.0
= 256 Class C-like subnets
in N.N.N.H subnetting method
■ 140.113.209.0 subnet
● Benefits of subnetting
○ Reduce the routing table size of Internet routers
○ Ex:
■ All external routers have only one entry for 140.113 Class B network
Network Layer – Subnetting, CIDR, and Netmask (3)
● Netmask
○ Specify how many bits of the IP address are used for the network ID
○ Continuous 1 bits form the network part
○ E.g.,
■ 255.255.255.0 in NCTU-CS example
● 256 addresses available (254 usable by hosts)
■ 255.255.255.248 in ADSL example
● Only 8 addresses available (6 usable by hosts)
○ Shorthand notation
■ Address/prefix-length
● Ex: 140.113.209.8/24
Network Layer – Subnetting, CIDR, and Netmask (4)
● How to determine your network ID?
○ Bitwise-AND IP and netmask
○ E.g.,
○ 140.113.214.37 & 255.255.255.0 => 140.113.214.0
○ 140.113.209.37 & 255.255.255.0 => 140.113.209.0
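The bitwise AND above can be reproduced directly. A minimal sketch using the standard `ipaddress` module; `network_id` and `same_subnet` are illustrative names.

```python
import ipaddress


def network_id(ip: str, netmask: str) -> str:
    """Bitwise-AND an IPv4 address with its netmask to get the network ID."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(netmask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


def same_subnet(a: str, b: str, netmask: str) -> bool:
    """Two hosts are on the same subnet when their network IDs match."""
    return network_id(a, netmask) == network_id(b, netmask)


if __name__ == "__main__":
    print(network_id("140.113.214.37", "255.255.255.0"))
    print(network_id("140.113.209.37", "255.255.255.0"))
    print(same_subnet("140.113.214.37", "140.113.209.37", "255.255.255.0"))
```

A host can use the same comparison to decide whether a destination is on its own network or must be reached through a router.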
Network Layer – Subnetting, CIDR, and Netmask (6)
● The smallest subnetting
○ Network portion : 30 bits
○ Host portion : 2 bits
=> 4 addresses, but only 2 are usable by hosts (the network address and the broadcast address are reserved)
● ipcalc
○ $ pkg install ipcalc
○ /usr/ports/net-mgmt/ipcalc
$ ipcalc 140.113.235.100/28
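For readers without ipcalc installed, a similar summary can be computed with Python's `ipaddress` module. This is only a sketch of the arithmetic, not a reproduction of ipcalc's output; `subnet_summary` is an illustrative name, and it assumes a prefix length of 30 or shorter so that a network address and a broadcast address both exist.

```python
import ipaddress


def subnet_summary(cidr: str) -> dict:
    """Summarise the subnet that contains an address given as address/prefix-length."""
    # strict=False lets the input be a host address inside the subnet.
    net = ipaddress.ip_network(cidr, strict=False)
    if net.prefixlen > 30:
        raise ValueError("this sketch assumes a prefix length of 30 or shorter")
    return {
        "network": str(net.network_address),
        "netmask": str(net.netmask),
        "broadcast": str(net.broadcast_address),
        "addresses": net.num_addresses,
        "first_host": str(net.network_address + 1),
        "last_host": str(net.broadcast_address - 1),
        "usable_hosts": net.num_addresses - 2,
    }


if __name__ == "__main__":
    print(subnet_summary("140.113.235.100/28"))
    print(subnet_summary("140.113.235.100/30"))
```

With a /30 prefix the summary shows 4 addresses and 2 usable hosts, the smallest subnet described above.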
dest IP = 140.252.13.33
Ex Routing table:
140.252.13.33 00:d0:59:83:d9:16 UHLW fxp1
Network Layer – IP Routing (4)
● Ex2:
○ routing across multi-network
○ dest IP = 192.48.96.9 in the IP header at every hop; only the link-layer destination changes
■ bsdi (140.252.13.35): next hop = 140.252.13.33 (default), dest Enet = Enet of 140.252.13.33
■ sun (140.252.13.33): next hop = 140.252.1.183 (default), sent over the SLIP link
■ netb (140.252.1.183): next hop = 140.252.1.4 (default), dest Enet = Enet of 140.252.1.4
■ gateway (140.252.1.4): next hop = 140.252.104.2 (default)
ARP and RARP
Something between MAC (link layer) And IP (network layer)
ARP and RARP
● ARP – Address Resolution Protocol and
RARP – Reverse ARP
○ Mapping between IP and Ethernet address
● When an Ethernet frame is sent on LAN from one host to another,
○ It is the 48-bit Ethernet address that determines for which interface
the frame is destined
ARP and RARP – ARP Example
● Example
○ % ftp bsd1
[Figure: steps (1) to (7): hostname → resolver (1) → IP addr; FTP (2) establish connection with IP address → TCP → IP → ARP → Ethernet driver on both hosts]
ARP and RARP – ARP Cache
● Maintain recent ARP results
○ Come from both ARP request and reply
○ Expiration time
■ Complete entry = 20 minutes
■ Incomplete entry = 3 minutes
○ Use arp command to see the cache
○ E.g.:
■ $ arp -a
■ $ arp -da
■ $ arp -S 140.113.235.132 00:0e:a6:94:24:6e
$ arp -a
crypto23.csie.nctu.edu.tw (140.113.208.143) at 00:16:e6:5b:fa:e9 on fxp1 [ethernet]
e3rtn-208.csie.nctu.edu.tw (140.113.208.254) at 00:0e:38:a4:c2:00 on fxp1 [ethernet]
e3rtn-210.csie.nctu.edu.tw (140.113.210.254) at 00:0e:38:a4:c2:00 on fxp2 [ethernet]
ARP and RARP – ARP/RARP Packet Format
● Ethernet destination addr: all 1’s (broadcast)
● Known value for IP <-> Ethernet
○ Frame type: 0x0806 for ARP, 0x8035 for RARP
○ Hardware type: type of hardware address (1 for Ethernet)
○ Protocol type: type of upper layer address (0x0800 for IP)
○ Hard size: size in bytes of hardware address (6 for Ethernet)
○ Protocol size: size in bytes of upper layer address (4 for IP)
○ Op: 1, 2, 3, 4 for ARP request, reply, RARP request, reply
Field                       Size (bytes)
Ethernet header (14 bytes)
Ethernet destination addr   6
Ethernet source addr        6
frame type                  2
28 byte ARP request/reply
hard type                   2
prot type                   2
hard size                   1
prot size                   1
op                          2
sender Ethernet addr        6
sender IP addr              4
target Ethernet addr        6
target IP addr              4
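The field list above maps directly onto two `struct` formats. The sketch below only builds the bytes of an ARP request and does not send anything; `build_arp_request` is an illustrative name, and the MAC and IP values are taken from the examples in these slides.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Return a 42-byte frame: 14-byte Ethernet header + 28-byte ARP request."""
    if len(sender_mac) != 6:
        raise ValueError("sender_mac must be 6 bytes")
    eth_header = struct.pack(
        "!6s6sH",
        b"\xff" * 6,   # Ethernet destination: all 1's (broadcast)
        sender_mac,    # Ethernet source
        0x0806,        # frame type: ARP
    )
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,             # hard type: Ethernet
        0x0800,        # prot type: IP
        6,             # hard size
        4,             # prot size
        1,             # op: ARP request
        sender_mac,
        socket.inet_aton(sender_ip),
        b"\x00" * 6,   # target Ethernet addr: unknown, to be filled in by the reply
        socket.inet_aton(target_ip),
    )
    return eth_header + arp_body


if __name__ == "__main__":
    frame = build_arp_request(bytes.fromhex("009096238f7d"),
                              "140.113.17.212", "140.113.17.215")
    print(len(frame), frame.hex())
```

The frame is 42 bytes long before any padding; a sender pads it to the 46-byte minimum data size before transmission.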
ARP and RARP – Use tcpdump to see ARP
● Host 140.113.17.212 => 140.113.17.215
○ Clear ARP cache of 140.113.17.212
■ $ sudo arp -d 140.113.17.215
○ Run tcpdump on 140.113.17.215 (00:11:d8:06:1e:81)
■ $ sudo tcpdump -i sk0 -e arp
■ $ sudo tcpdump -i sk0 -n -e arp
■ $ sudo tcpdump -i sk0 -n -t -e arp
○ On 140.113.17.212, ssh to 140.113.17.215
15:18:54.899779 00:90:96:23:8f:7d > Broadcast, ethertype ARP (0x0806), length 60:
arp who-has nabsd tell chbsd.csie.nctu.edu.tw
15:18:54.899792 00:11:d8:06:1e:81 > 00:90:96:23:8f:7d, ethertype ARP (0x0806), length 42:
arp reply nabsd is-at 00:11:d8:06:1e:81
[Figure: hosts bsdi (.35), sun (.33) and svr4 (.34) on Ethernet, 140.252.13; a SLIP (dialup) link through modems to 140.252.1.29; a SLIP link between slip (.65) and .66]
ARP and RARP – Gratuitous ARP
● Gratuitous ARP
○ The host sends an ARP request looking for its own IP
○ Provide two features
■ Used to determine whether there is another host configured with the
same IP
■ Used to cause any other host to update ARP cache when changing
hardware address
ARP and RARP – RARP
● Principle
○ Used by a diskless system, which reads its hardware address from the NIC and sends an RARP request to obtain its IP address
● RARP Server Design
○ RARP server must maintain the map from hardware address to IP address for many hosts
○ Link-layer broadcast
■ This prevents most routers from forwarding an RARP request
ICMP
Internet Control Message Protocol
ICMP – Introduction
● Part of the IP layer
○ ICMP messages are transmitted within IP datagram
○ ICMP communicates error messages and other conditions that require
attention for other protocols
● ICMP message format
IP datagram
ICMP – Message Type (1)
type code Description Query Error type code Description Query Error
ICMP – Message Type (2)
type code Description Query Error type code Description Query Error
11 time exceeded:
ICMP – Query Message
– Address Mask Request/Reply (1)
● Address Mask Request and Reply
○ Used for diskless system to obtain its subnet mask
○ Identifier and sequence number
■ Can be set to anything for sender to match reply with request
○ The receiver responds with an ICMP reply carrying the subnet mask of the receiving NIC
0 7 8 15 16 31
type (17 or 18) | code (0) | checksum
identifier | sequence number
32-bit address mask
ICMP – Query Message
– Address Mask Request/Reply (2)
● Example:
$ ping -M m sun1.cs.nctu.edu.tw
ICMP_MASKREQ
PING sun1.cs.nctu.edu.tw (140.113.235.171): 56 data bytes
68 bytes from 140.113.235.171: icmp_seq=0 ttl=251 time=0.663 ms mask=255.255.255.0
68 bytes from 140.113.235.171: icmp_seq=1 ttl=251 time=1.018 ms mask=255.255.255.0
68 bytes from 140.113.235.171: icmp_seq=2 ttl=251 time=1.028 ms mask=255.255.255.0
68 bytes from 140.113.235.171: icmp_seq=3 ttl=251 time=1.026 ms mask=255.255.255.0
^C
--- sun1.cs.nctu.edu.tw ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.663/0.934/1.028/0.156 ms
$ icmpquery -m sun1
sun1 : 0xFFFFFF00
ICMP – Query Message
– Timestamp Request/Reply (1)
● Timestamp request and reply
○ Allow a system to query another for the current time
○ Milliseconds resolution, since midnight UTC
○ Requestor
■ Fill in the originate timestamp and send
○ Reply system
■ Fill in the receive timestamp when it receives the request and the transmit time
when it sends the reply
0 7 8 15 16 31
type (13 or 14) code (0) checksum
identifier sequence number
originate timestamp
receive timestamp
transmit timestamp
ICMP – Query Message
– Timestamp Request/Reply (2)
● Example
$ ping -M time nabsd
ICMP_TSTAMP
PING nabsd.cs.nctu.edu.tw (140.113.17.215): 56 data bytes
76 bytes from 140.113.17.215: icmp_seq=0 ttl=64 time=0.663 ms
tso=06:47:46 tsr=06:48:24 tst=06:48:24
76 bytes from 140.113.17.215: icmp_seq=1 ttl=64 time=1.016 ms
tso=06:47:47 tsr=06:48:25 tst=06:48:25
$ icmpquery -t nabsd
nabsd : 14:54:47
ICMP – Error Message
– Port Unreachable (1)
● ICMP port unreachable
○ Type = 3 , code = 3
○ Host receives a UDP datagram but the destination port does not
correspond to a port that some process has in use
ICMP – Error Message
– Port Unreachable (2)
● Example:
○ Using TFTP (Trivial File Transfer Protocol)
■ Default TFTP port: 69
$ tftp
tftp> connect localhost 8888
tftp> get temp.foo
Transfer timed out.
tftp>
ICMP – Ping Program (2)
● Ex:
○ ServerA ping ServerB
○ execute “tcpdump -i sk0 -X -e icmp” on ServerB
ServerA $ ping ServerB
PING ServerB.cs.nctu.edu.tw (140.113.17.215): 56 data bytes
64 bytes from 140.113.17.215: icmp_seq=0 ttl=64 time=0.520 ms
● To get the route that packets take to host
○ Command: ping -R
○ Cause every router that handles the datagram to add its (outgoing) IP address to a list in the options field.
○ Format of Option field for IP RR Option
■ code, len, ptr, followed by the list of recorded IP addresses
Traceroute Program (1)
● To print the route packets take to network host
● Drawbacks of IP RR options (ping -R)
○ Not all routers support the IP RR option
○ Limitation of IP header length
● Background knowledge of traceroute
○ When a router receives a datagram, it decrements the TTL by one
○ When a router receives a datagram with TTL = 0 or 1,
■ it throws away the datagram and
■ sends back a “Time exceeded” ICMP message
○ Unused UDP port will generate a “port unreachable” ICMP message
Traceroute Program (2)
● Operation of traceroute
○ Send UDP datagrams to a port > 30000, encapsulated in IP datagrams with TTL = 1, 2, 3, … in turn
○ When a router receives the datagram and TTL = 1, it returns a “Time exceeded” ICMP message
○ When the destination host receives the datagram, even with TTL = 1, it returns a “Port unreachable” ICMP message
[Figure: source host, two routers, destination host]
■ Probe 1: IP (TTL=1) reaches the first router, which returns ICMP (time exceeded) to the source host
■ Probe 2: IP (TTL=2) is forwarded as IP (TTL=1) to the second router, which returns ICMP (time exceeded)
■ Probe 3: IP (TTL=3) is forwarded as IP (TTL=2), then IP (TTL=1), and reaches the destination host, which returns ICMP (port unreachable)
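The TTL logic of traceroute can be traced without sending any packets. The sketch below is a pure simulation under simplifying assumptions (a fixed path, no packet loss, every router answers); `simulate_traceroute` and the hop names are made up for illustration, and no real sockets are used.

```python
def simulate_traceroute(routers: list[str], destination: str,
                        max_hops: int = 64) -> list[tuple[int, str, str]]:
    """Return (initial TTL, replying node, ICMP message) for each probe."""
    results = []
    for ttl in range(1, max_hops + 1):
        remaining = ttl
        reply = None
        for router in routers:
            # A router that receives TTL 0 or 1 discards the datagram
            # and reports "time exceeded" to the source.
            if remaining <= 1:
                reply = (ttl, router, "time exceeded")
                break
            remaining -= 1  # otherwise it decrements the TTL and forwards
        if reply is None:
            # The probe reached the destination, where the UDP port is unused.
            reply = (ttl, destination, "port unreachable")
        results.append(reply)
        if reply[2] == "port unreachable":
            break  # destination reached, stop probing
    return results


if __name__ == "__main__":
    for hop in simulate_traceroute(["router-1", "router-2"], "destination"):
        print(hop)
```

With two routers on the path, the third probe is the first to reach the destination, matching the three cases above.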
Traceroute Program (3)
● Time exceeded ICMP message
○ Type = 11, code = 0 or 1
■ Code = 0 means TTL=0 during transit
■ Code = 1 means TTL=0 during reassembly
○ First 8 bytes of datagram
■ UDP header
0 8 16 31
TYPE (11) CODE (0 or 1) CHECKSUM
UNUSED (MUST BE ZERO)
INTERNET HEADER + FIRST 64 BITS OF DATAGRAM
...
Traceroute Program (4)
● Example
$ traceroute bsd1.cs.nctu.edu.tw
traceroute to bsd1.cs.nctu.edu.tw (140.113.235.131), 64 hops max, 40 byte packets
1 e3rtn.csie.nctu.edu.tw (140.113.17.254) 0.377 ms 0.365 ms 0.293 ms
2 ProjE27-254.NCTU.edu.tw (140.113.27.254) 0.390 ms 0.284 ms 0.391 ms
3 140.113.0.58 (140.113.0.58) 0.292 ms 0.282 ms 0.293 ms
4 140.113.0.165 (140.113.0.165) 0.492 ms 0.385 ms 0.294 ms
5 bsd1.cs.nctu.edu.tw (140.113.235.131) 0.393 ms 0.281 ms 0.393 ms
Traceroute Program –
[Figure: IP header layout (version, header length, type of service, total length, identification, flags, fragment offset)]
IP Routing – Processing in IP Layer
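[Figure: processing in the IP layer. Datagrams from the network interface are placed on the IP input queue, and IP options are processed (including source routing). If the packet is ours (one of our IP addresses or broadcast addrs), it is passed up; if not, the datagram is forwarded (if forwarding enabled). IP output calculates the next hop router (if necessary) from the routing table. The routing table is updated by the routing daemon, the route command and ICMP redirects, and is displayed by the netstat command.]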
IP Routing – Routing Table (1)
● Routing Table
○ Command to list: netstat -rn
○ Flag
■ U: the route is up
■ G: the route is to a router (indirect route)
● Indirect route: IP is the dest. IP, MAC is the router’s MAC
■ H: the route is to a host (Not to a network)
● The dest. field is a host IP address if H is set, otherwise a network address
■ S: the route is static
○ Expire: expiration time for each route
$ netstat -rn
Routing tables
Internet:
Destination Gateway Flags Netif Expire
Default 140.113.17.254 UGS em0
127.0.0.1 link#2 UH lo0
140.113.17.0/24 link#1 U em0
140.113.17.225 link#1 UHS lo0
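A lookup against the table above can be sketched as a longest-prefix match, which is an assumption of this example: it gives the usual preference of host route, then network route, then default route. The `ROUTES` list transcribes the netstat output, with `default` written as `0.0.0.0/0` and host entries as /32; `lookup` is an illustrative name.

```python
import ipaddress

# (destination, gateway, flags, netif), transcribed from the netstat -rn output.
ROUTES = [
    ("0.0.0.0/0", "140.113.17.254", "UGS", "em0"),   # default route
    ("127.0.0.1/32", "link#2", "UH", "lo0"),
    ("140.113.17.0/24", "link#1", "U", "em0"),
    ("140.113.17.225/32", "link#1", "UHS", "lo0"),
]


def lookup(dest: str, routes=ROUTES):
    """Return (gateway, flags, netif) of the most specific matching route, or None."""
    addr = ipaddress.IPv4Address(dest)
    best = None
    for prefix, gateway, flags, netif in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, flags, netif)
    if best is None:
        return None  # no match: the "no route to destination" case
    return best[1:]


if __name__ == "__main__":
    for d in ("140.113.17.225", "140.113.17.24", "8.8.8.8"):
        print(d, lookup(d))
```

A destination outside 140.113.17.0/24 falls through to the default entry, whose G flag marks it as an indirect route through the router 140.113.17.254.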
IP Routing – Routing Table (2)
● Example:
1. dst. = sun
2. dst. = slip
3. dst. = 192.207.117.2
4. dst. = svr4 or 140.252.13.34
5. dst. = 127.0.0.1
[Figure: gateway (.4) connects the Internet (140.252.104.1) to Ethernet, subnet 140.252.1; a SLIP link at 140.252.1.29 leads to Ethernet, subnet 140.252.13.32, with hosts bsdi (.35), sun (.33) and svr4 (.34); slip (.65) and .66 form subnet 140.252.13.64]
ICMP – No Route to Destination
● If there is no match in routing table
○ If the IP datagram is generated on the host
■ “host unreachable” or “network unreachable”
○ If the IP datagram is being forwarded
■ ICMP “host unreachable” error message is generated and sent back to the sending host
■ ICMP message
● Type = 3, code = 0 for network unreachable
● Type = 3, code = 1 for host unreachable
ICMP – Redirect Error Message (1)
● Concept
○ Used by router to inform the sender that the datagram should be sent to a
different router
○ This will happen if the host has a choice of routers to send the packet to
■ Ex:
● R1 found sending and receiving interface are the same
[Figure: (1) host sends IP datagram to R1; (2) R1 forwards IP datagram to R2, toward the final destination]
ICMP – Redirect Error Message (2)
● ICMP redirect message format
○ Code 0: redirect for network
○ Code 1: redirect for host
○ Code 2: redirect for TOS and network (RFC 1349)
○ Code 3: redirect for TOS and host (RFC 1349)
ICMP – Router Discovery Messages (1)
● Dynamically update host’s routing table
○ ICMP router solicitation message (solicitation: an earnest request)
■ Host broadcast or multicast after bootstrapping
○ ICMP router advertisement message
■ Router response
■ Router periodically broadcast or multicast
● Format of ICMP router solicitation message
ICMP – Router Discovery Messages (2)
● Format of ICMP router advertisement message
○ Router address
■ Must be one of the router’s IP addresses
○ Preference level
■ Preference as a default router address
UDP – User Datagram Protocol
UDP
● No reliability
○ Datagram-oriented, not stream-oriented protocol
● UDP header
○ 8 bytes
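○ Source port and destination port
■ Identify sending and receiving process
○ UDP length: ≧ 8
bits 0-15 | bits 16-31
16-bit source port number | 16-bit destination port number
16-bit UDP length | 16-bit UDP checksum
data (if any)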
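The 8-byte UDP header is four 16-bit fields, so it packs and unpacks with a single format string. A minimal sketch: `build_udp_header` and `parse_udp_header` are illustrative names, and the checksum is left at 0 here (which in IPv4 means "no checksum computed") rather than calculated.

```python
import struct


def build_udp_header(src_port: int, dst_port: int, payload: bytes,
                     checksum: int = 0) -> bytes:
    """Pack source port, destination port, length and checksum (8 bytes)."""
    length = 8 + len(payload)  # UDP length covers header plus data, so it is >= 8
    if length > 0xFFFF:
        raise ValueError("payload too large for the 16-bit UDP length field")
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


def parse_udp_header(segment: bytes) -> dict:
    """Decode the first 8 bytes of a UDP datagram."""
    if len(segment) < 8:
        raise ValueError("segment is shorter than the 8-byte UDP header")
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", segment[:8])
    return {"src_port": src_port, "dst_port": dst_port,
            "length": length, "checksum": checksum}


if __name__ == "__main__":
    data = b"hello"
    datagram = build_udp_header(12345, 53, data) + data
    print(parse_udp_header(datagram))
```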
UDP
● Application
○ VoIP
○ VPN (OpenVPN over UDP)
○ DNS
○ SNMP
○ Quick UDP Internet Connections (QUIC)
■ Designed by Google, based on UDP
■ Basis of HTTP/3 (“HTTP over QUIC” was renamed to “HTTP/3”)
■ Reliable like TCP, but with lower latency
● As most HTTP connections will demand TLS, QUIC makes the exchange of setup
keys and supported protocols part of the initial handshake process.
● During network-switch events, reuse old connection instead of creating a new one
as TCP does.
TCP – Transmission Control Protocol
TCP
● Services
○ Connection-oriented
■ Establish TCP connection before exchanging data
○ Reliability
■ Acknowledgement when receiving data
■ Retransmission when timeout
■ Ordering
■ Discard duplicated data
■ Flow control
TCP – Header (1)
[Figure: TCP header layout, bits 0-15 and 16-31]
TCP – Header (2)
● Flags
○ SYN
■ Establish new connection
○ ACK
■ Acknowledgement number is valid
■ Used to ack previous data that host has received
○ RST
■ Reset connection
○ FIN
■ The sender is finished sending data
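The flags are single bits in one header byte, so a received value can be decoded with bit masks. A minimal sketch: `decode_flags` is an illustrative name, and the table also includes PSH and URG, which are standard TCP flags not listed on this slide.

```python
# Bit masks of the TCP flag bits (low 6 bits of the flags byte).
TCP_FLAGS = {
    "FIN": 0x01,  # the sender is finished sending data
    "SYN": 0x02,  # establish new connection
    "RST": 0x04,  # reset connection
    "PSH": 0x08,  # not covered on this slide
    "ACK": 0x10,  # acknowledgement number is valid
    "URG": 0x20,  # not covered on this slide
}


def decode_flags(bits: int) -> list[str]:
    """Return the names of the flags set in a TCP flags byte."""
    return [name for name, mask in TCP_FLAGS.items() if bits & mask]


if __name__ == "__main__":
    print(decode_flags(0x02))  # first segment of a handshake
    print(decode_flags(0x12))  # second segment of a handshake
    print(decode_flags(0x11))  # a FIN that also acknowledges data
```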
TCP connection – establishment and termination
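Three-way handshake
segment 1 (client → server): SYN 1415531521:1415531521(0)
segment 2 (server → client): SYN 1823083521:1823083521(0), ACK 1415531522, <mss 1024>
segment 3 (client → server): ACK 1823083522
Connection termination
segment 4 (client → server): FIN 1415531522:1415531522(0), ACK 1823083522
segment 5 (server → client): ACK 1415531523
segment 6 (server → client): FIN 1823083522:1823083522(0), ACK 1415531523
segment 7 (client → server): ACK 1823083523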
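The acknowledgement numbers in the trace follow one rule: a SYN (and likewise a FIN) consumes one sequence number, so the peer acknowledges the initial sequence number plus one. The sketch below reproduces the numbers of the three handshake segments; `three_way_handshake` is an illustrative name, and the two initial sequence numbers are the ones appearing in the trace on this slide.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the seq/ack numbers of the three handshake segments."""
    return [
        {"segment": 1, "from": "client", "flags": "SYN",
         "seq": client_isn, "ack": None},
        {"segment": 2, "from": "server", "flags": "SYN,ACK",
         "seq": server_isn, "ack": (client_isn + 1) % SEQ_SPACE},
        {"segment": 3, "from": "client", "flags": "ACK",
         "seq": (client_isn + 1) % SEQ_SPACE,
         "ack": (server_isn + 1) % SEQ_SPACE},
    ]


if __name__ == "__main__":
    for seg in three_way_handshake(1415531521, 1823083521):
        print(seg)
```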
Introduction – Encapsulation
● Multiplexing
○ Gathering data from multiple sockets, enveloping data with header
[Figure: encapsulation of data as it goes down the protocol stack]
application: user data
application: Appl header | user data
TCP: TCP header | application data (TCP segment)
IP: IP header | TCP header | application data (IP datagram)
Ethernet driver: Ethernet header | IP header | TCP header | application data | Ethernet trailer (Ethernet frame)
Introduction – Decapsulation
● Demultiplexing
○ Delivering received segments to correct socket
[Figure: an incoming frame arrives at the Ethernet driver; demultiplexing based on frame type in Ethernet header]
Introduction – Addressing
● Addressing
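○ Nearby (same network)
[Figure: FTP client and FTP server on the same network. Application: FTP client and FTP server, FTP protocol (user processes handle application details). Transport: TCP and TCP, TCP protocol. Network: IP and IP, IP protocol. The kernel handles communication details.]
[Figure: the same stack across a router. Transport: TCP and TCP, TCP protocol end to end. Network: IP, router IP, IP, with the IP protocol on each hop.]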
■ Default
● Best-effort traffic
■ Expedited Forwarding (EF)
● Dedicated to low-loss, low-latency traffic
■ Class Selector
● Backward compatibility with the IP Precedence field
■ Assured Forwarding (AF)
● Give assurance of delivery under prescribed conditions
● ECN: Explicit Congestion Notification (2-bit)
○ FreeBSD 8.0 implement ECN support for TCP
■ Enable ECN via sysctl(8)
● net.inet.tcp.ecn.enable=1
■ Linux Kernel supports ECN for TCP since version 2.4.20

DSCP Class Selector Names   Binary DSCP Values   IPP Binary Values   IPP Names
Default/CS0*                000000               000                 Routine
CS1                         001000               001                 Priority
CS2                         010000               010                 Immediate
CS3                         011000               011                 Flash
CS4                         100000               100                 Flash Override
CS5                         101000               101                 Critic/Critical
CS6                         110000               110                 Internetwork Control
CS7                         111000               111                 Network Control

Assured Forwarding table columns: Queue Class | Low Drop Probability (Name/Dec/Bin) | Medium Drop Probability (Name/Dec/Bin) | High Drop Probability (Name/Dec/Bin)
single IP datagram
● Fragmentation offset (13-bit)
○ Specify the offset of a particular fragment relative to the beginning of
the original unfragmented IP datagram
● Flags (3-bit)
○ All these three fields are used for fragmentation
○ Bits: Reserved, Don’t Fragment (DF), More Fragments (MF)
● TTL (8-bit)
● Protocol (8-bit)
○ Used to demultiplex to other protocols
○ TCP, UDP, ICMP, IGMP
● Header checksum (16-bit)
○ Calculated over the IP header only
○ If checksum error, IP discards the datagram and no error message is
generated
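The header checksum is the 16-bit one's complement of the one's complement sum of the header's 16-bit words, computed with the checksum field set to zero. A minimal sketch follows; `ip_checksum` is an illustrative name and the sample header bytes are a hand-picked example, not taken from these slides.

```python
import struct


def ip_checksum(header: bytes) -> int:
    """Return the 16-bit one's complement checksum over the given header bytes."""
    if len(header) % 2:
        header += b"\x00"  # pad to a whole number of 16-bit words
    words = struct.unpack(f"!{len(header) // 2}H", header)
    total = sum(words)
    # Fold the carries back into the low 16 bits (one's complement addition).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    # 20-byte header with the checksum field (bytes 10-11) set to zero.
    header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
    print(hex(ip_checksum(header)))
    # A receiver runs the same sum over the header including the checksum;
    # an intact header gives 0.
    filled = header[:10] + ip_checksum(header).to_bytes(2, "big") + header[12:]
    print(ip_checksum(filled))
```

Because the sum covers only the IP header, a router that decrements the TTL must update the checksum, but it never has to read the payload.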
IP Fragmentation (1)
● MTU limitation
○ Before network-layer to link-layer
■ IP will check the size and link-layer MTU
■ Do fragmentation if necessary
○ Fragmentation may be done at sending host or routers
○ Reassembly is done only in receiving host
IP datagram (1501 bytes)
IP header (20 bytes) | UDP header (8 bytes) | UDP data (1473 bytes)
is fragmented into
packet 1 (1500 bytes): IP header (20 bytes) | UDP header (8 bytes) | UDP data (1472 bytes)
packet 2 (21 bytes): IP header (20 bytes) | UDP data (1 byte)
IP Fragmentation (2)
identification: which unique IP datagram
flags: more fragments?
fragment offset: offset of this datagram from the beginning of original datagram
IP datagram (1501 bytes): IP header (20 bytes) | UDP header (8 bytes) | UDP data (1473 bytes)
packet 1 (1500 bytes): IP header (20 bytes) | UDP header (8 bytes) | UDP data (1472 bytes)
packet 2 (21 bytes): IP header (20 bytes) | UDP data (1 byte)
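The split shown above can be computed from the MTU. The sketch below works on lengths only and assumes a 20-byte IP header without options; `fragment` is an illustrative name. Every fragment except the last must carry a multiple of 8 data bytes, because the fragment offset field counts 8-byte units.

```python
def fragment(payload_len: int, mtu: int = 1500, header_len: int = 20) -> list[dict]:
    """Split an IP payload of payload_len bytes into fragments that fit the MTU."""
    if mtu - header_len < 8:
        raise ValueError("MTU too small to carry any fragment data")
    # Largest data size per fragment, rounded down to a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        fragments.append({
            "offset_units": offset // 8,        # value of the 13-bit offset field
            "data_len": size,
            "total_len": header_len + size,     # value of the total length field
            "more_fragments": offset + size < payload_len,
        })
        offset += size
    return fragments


if __name__ == "__main__":
    # IP payload = 8-byte UDP header + 1473 bytes of UDP data.
    for f in fragment(8 + 1473):
        print(f)
```

For the 1501-byte datagram this gives a first packet of 1500 bytes with the more-fragments bit set and a second packet of 21 bytes at offset 185 (185 * 8 = 1480).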
IP Fragmentation (3)
● Issues of fragmentation
○ One fragment lost, entire datagram must be retransmitted
○ If the fragmentation is performed by an intermediate router, the sending host has no way of knowing how the datagram was fragmented
○ Fragmentation is often avoided
■ There is a “don’t fragment” bit in flags of IP header
bits 0-15 | bits 16-31
4-bit version, 4-bit header length, 8-bit type of service (TOS) | 16-bit total length (in bytes)
16-bit identification | 3-bit flags, 13-bit fragment offset
8-bit time to live (TTL), 8-bit protocol | 16-bit header checksum
32-bit source IP address
32-bit destination IP address
(the five rows above are the 20-byte header)
data
ICMP Unreachable Error – Fragmentation Required
● Type=3, code=4
○ Router will generate this error message if the datagram needs to be fragmented, but the “don’t fragment” bit is turned on in the IP header
● Message format
ICMP – Source Quench Error
● Type=4, code=0
○ May be generated by system when it receives datagram at a rate that
is too fast to be processed
○ Host receiving more datagrams than it can handle
■ Send ICMP source quench or
■ Throw it away
○ Host receiving UDP source quench message
■ Ignore it or
■ Notify application
Appendix of IP Options: IP Timestamp Option
● IP Timestamp Option
○ Similar to RR option
○ Record Timestamp in option field
■ code, len, ptr are the same as IP RR option
■ OF
● Overflow field
● Router will increment OF if it can’t add a timestamp because of no room left
■ FL
● Flags
● 0: only timestamp
● 1: both timestamp and IP address
● 3: the sender initiates the options with up to 4
pairs of IP address and timestamp
Option layout (40 bytes in total):
code (1 byte) | len (1 byte) | ptr (1 byte) | OF (4 bits) + FL (4 bits) | timestamp #1 (4 bytes) | timestamp #2 (4 bytes) | timestamp #3 (4 bytes) | ... | timestamp #9 (4 bytes)