Distributed Programming with Sockets

Wondimagegn D.

September 29, 2019

Wondimagegn D. (AAIT ) Distributed System Programming September 29, 2019 1 / 36


Overview

1 A Very Brief Introduction to Computer Networks

2 IP Addressing and Name Resolution

3 TCP

4 I/O Multiplexing

5 Server Structures



A Very Brief Introduction to Computer Networks

Introduction

There is one way computers can communicate with each other


By sending network messages to each other
All other kinds of communications are built from messages
There is one way programs can send/receive network messages
Through sockets
All other communication paradigms are built from sockets


Two Different Kinds of Networks

Circuit switching
One electrical circuit assigned per communication
Example: the (analog) phone network
Advantage: guaranteed constant quality of service
Drawbacks: wasted resources (e.g., during periods of silence), poor fault tolerance
Packet switching
Messages are split into packets, which are transmitted independently
Packets can take different routes
Network infrastructures are shared among users
Example: the Internet, and most computer networks
Advantages: good resource usage, fault tolerance
Drawbacks: variable QoS, packets may be delivered in the wrong order


Internet Protocol

Most computer networks use the Internet Protocol


The base protocol: IP (Internet Protocol)
Send packets of limited size
Up to 65,535 bytes
But if the MTU (Maximum Transmission Unit) of some link on the
path is lower, the packet will be fragmented (IPv4) or dropped (IPv6)
Every IPv4 host must accept packets of at least 576 bytes; in practice MTUs are higher nowadays
Each packet is sent to an IP address
Example: 10.5.55.87
IP offers no guarantee:
Packets may get lost
Packets may be delivered twice
Packets may be delivered in wrong order
Packets may be corrupted during transfer
Usually, programs do not use IP directly
All other Internet Protocols are built over IP

UDP: User Datagram Protocol

UDP is very similar to IP


Send/receive packets
No guarantee
In UDP, packets are called datagrams
Each datagram is sent to an IP address and a port number
Example: 10.5.55.87 port=1234
Ports distinguish between several programs running
simultaneously on the same machine
Program A uses port 1234
Program B uses port 1235
When a datagram is received, the OS knows which program it should
be delivered to.


TCP: Transmission Control Protocol

TCP establishes connections between pairs of machines


To communicate with a remote host, we must first connect to it
TCP provides the illusion of a reliable data flow to the users
Flows are split into packets, but the users don’t see them
TCP guarantees that the data sent will not be lost, reordered,
corrupted, etc.
The sender gives numbers to packets so that the receiver can reorder
them.
The receiver acknowledges received packets so that the sender can
retransmit lost packets.
Communication is bi-directional
The same connection can be used to send data in the two directions
E.g., a request and its response


A Very Simplified View (figure not reproduced)



IP Addressing and Name Resolution

IP Address Conversion

IP Addresses
32-bit integers: 2183468070 (good for computers!)
Dotted strings: 130.37.20.38 (good for humans!)
DNS name: www.aait.edu.et (even better for humans!)
You can convert between integer and dotted string:


    #include <arpa/inet.h>
    in_addr_t inet_addr(const char *dotted);    /* Dotted to Network */
    char *inet_ntoa(struct in_addr network);    /* Network to Dotted */

in_addr_t is an unsigned 32-bit integer
struct in_addr is a structure containing an in_addr_t
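
As a minimal sketch of the round trip between the integer and dotted forms, the two functions can be wrapped as shown here. The helper names dotted_to_host_u32 and host_u32_to_dotted are invented for this illustration and are not part of any library. The sketch assumes IPv4 only, and it relies on ntohl()/htonl() to move between network and host byte order. Note that inet_addr() reports failure with INADDR_NONE, which cannot be told apart from the valid address 255.255.255.255.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: dotted string -> 32-bit integer in host byte order.
 * Returns 0 on success, -1 if the string is not a valid IPv4 address. */
int dotted_to_host_u32(const char *dotted, uint32_t *out) {
    in_addr_t net = inet_addr(dotted);   /* result is in network byte order */
    if (net == INADDR_NONE)              /* also returned for 255.255.255.255 */
        return -1;
    *out = ntohl(net);
    return 0;
}

/* Hypothetical helper: 32-bit integer in host byte order -> dotted string. */
void host_u32_to_dotted(uint32_t host, char *buf, size_t len) {
    struct in_addr addr;
    addr.s_addr = htonl(host);
    /* inet_ntoa() returns a pointer to a static buffer, so copy it out */
    snprintf(buf, len, "%s", inet_ntoa(addr));
}
```

With the address from the slide, dotted_to_host_u32("130.37.20.38", &v) stores 2183468070 in v, and host_u32_to_dotted(2183468070, buf, sizeof buf) writes the dotted string back.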


Big/Little-endian, Network Ordering

Computers represent numbers in different byte orderings:
Big-endian: 0x12345678 => 0x12 0x34 0x56 0x78 (e.g., PowerPC, Sparc, UltraSparc)
Little-endian: 0x12345678 => 0x78 0x56 0x34 0x12 (e.g., Alpha, i386, AMD64)
Network byte order is big-endian
To convert numbers: host <-> network ordering

    #include <netinet/in.h>

    uint16_t htons(uint16_t value);   /* Host to Network, 16 bits */
    uint32_t htonl(uint32_t value);   /* Host to Network, 32 bits */
    uint16_t ntohs(uint16_t value);   /* Network to Host, 16 bits */
    uint32_t ntohl(uint32_t value);   /* Network to Host, 32 bits */
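
To illustrate why these conversions matter, here is a small sketch that writes a port number and a 32-bit value into a byte buffer in network order and reads them back. The 6-byte layout and the names pack_header and unpack_header are assumptions made up for this example; they do not correspond to any real protocol.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 6-byte wire header: a 16-bit port followed by a 32-bit value,
 * both stored in network byte order. */
void pack_header(unsigned char buf[6], uint16_t port, uint32_t value) {
    uint16_t nport = htons(port);     /* host -> network, 16 bits */
    uint32_t nvalue = htonl(value);   /* host -> network, 32 bits */
    memcpy(buf, &nport, sizeof nport);
    memcpy(buf + 2, &nvalue, sizeof nvalue);
}

void unpack_header(const unsigned char buf[6], uint16_t *port, uint32_t *value) {
    uint16_t nport;
    uint32_t nvalue;
    memcpy(&nport, buf, sizeof nport);
    memcpy(&nvalue, buf + 2, sizeof nvalue);
    *port = ntohs(nport);             /* network -> host, 16 bits */
    *value = ntohl(nvalue);           /* network -> host, 32 bits */
}
```

Whatever the host's native ordering, the bytes in the buffer come out most significant first: packing port 1234 and value 0x12345678 yields 04 D2 12 34 56 78.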


sockaddr_in: Unix Network Addresses

Unix represents network addresses with a struct sockaddr
This structure is generic for all kinds of networks
For Internet addresses, we use sockaddr_in

    struct sockaddr_in {
        sa_family_t    sin_family;   /* set to AF_INET */
        in_port_t      sin_port;     /* Port number */
        struct in_addr sin_addr;     /* Contains the IP address */
    };

    struct in_addr {
        in_addr_t s_addr;            /* IP address in network ordering */
    };

sin_family: indicates which type of address. Always set to AF_INET.
sin_port: port number, in network byte order
sin_addr.s_addr: IP address, in network byte order. To represent an
unspecified IP address, set it to htonl(INADDR_ANY).
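
The three fields can be filled in as sketched here. The function name make_addr and its convention of passing NULL to mean "unspecified address" are assumptions made for this illustration only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill addr for a dotted IPv4 string (or NULL for the
 * unspecified address) and a port given in host byte order.
 * Returns 0 on success, -1 if the dotted string is invalid. */
int make_addr(struct sockaddr_in *addr, const char *dotted, uint16_t port) {
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);                   /* network byte order */
    if (dotted == NULL) {
        addr->sin_addr.s_addr = htonl(INADDR_ANY);  /* unspecified address */
        return 0;
    }
    in_addr_t ip = inet_addr(dotted);               /* already network order */
    if (ip == INADDR_NONE)
        return -1;
    addr->sin_addr.s_addr = ip;
    return 0;
}
```

Clearing the structure with memset() first is a common precaution, since some systems define extra padding fields in struct sockaddr_in.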


Domain Names

Internet Protocols are all based on IP addresses


But IP addresses are hard for humans to remember
Our web server: http://10.5.10.3
Better: http://www.aait.edu.et
Using Domain Names
Domain names cannot be used directly by network protocols
Network protocols only use IP addresses
But you can convert domain names into IP addresses thanks to DNS
Domain Name System (DNS): handles domain name resolution
Hundreds of thousands of servers around the world that cooperate to
resolve addresses
To learn more on how this works, go to the Distributed Systems book
page 14!


Converting Domain Names to IP

Conversion is done by gethostbyname()

    #include <netdb.h>
    struct hostent *gethostbyname(const char *name);

...where struct hostent is as follows:

    struct hostent {
        char  *h_name;        /* official name of host */
        char **h_aliases;     /* alias list */
        int    h_addrtype;    /* host address type */
        int    h_length;      /* length of address */
        char **h_addr_list;   /* list of addresses */
    };

h_addr_list: a null-terminated array of network addresses for the host


gethostbyname()

Example

    #include <stdio.h>
    #include <netdb.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    /* Print the first IP address of name; return 0 on success, -1 on failure */
    int print_resolv(const char *name) {
        struct hostent *resolv;
        struct in_addr *addr;

        resolv = gethostbyname(name);
        if (resolv == NULL) {
            printf("Address not found for %s\n", name);
            return -1;
        }
        else {
            /* h_addr_list[0] points to the first address, in network byte order */
            addr = (struct in_addr *) resolv->h_addr_list[0];
            printf("The IP address of %s is %s\n", name, inet_ntoa(*addr));
            return 0;
        }
    }
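
POSIX marks gethostbyname() as obsolete and recommends getaddrinfo() instead. This is a different function from the one the slides use, shown only as a hedged alternative. The following sketch resolves a name to its first IPv4 address; the helper name resolve_ipv4 is made up for this example, and restricting the lookup to AF_INET is an assumption chosen to match the IPv4-only material above.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve name and write its first IPv4 address into
 * out as a dotted string. Returns 0 on success, -1 on failure. */
int resolve_ipv4(const char *name, char *out, size_t outlen) {
    struct addrinfo hints;
    struct addrinfo *res;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 results only */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one result per socket type */

    if (getaddrinfo(name, NULL, &hints, &res) != 0)
        return -1;

    /* For AF_INET results, ai_addr points to a struct sockaddr_in */
    struct sockaddr_in *sin = (struct sockaddr_in *) res->ai_addr;
    const char *p = inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t) outlen);

    freeaddrinfo(res);                /* the result list is heap-allocated */
    return p != NULL ? 0 : -1;
}
```

Unlike gethostbyname(), getaddrinfo() does not return a pointer to static storage, which makes it safe to call from several threads.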



TCP

TCP Sockets

Defined in RFC 793


Popular TCP-based protocols
TELNET
FTP – File Transfer Protocol
SMTP – Simple Mail Transfer Protocol
HTTP – Hypertext Transfer Protocol
SSH – Secure Shell


TCP Socket Functions (figure not reproduced)


The TCP three-way handshake (figure not reproduced)


Creating a Socket

Some functions are the same as in UDP


socket(): creates a socket

    sockfd = socket(AF_INET, SOCK_STREAM, 0);

bind(): to specify the address of a socket


Only useful for server sockets
Exactly like UDP sockets


listen(): Setting a Server Socket

By default, TCP sockets are created as client sockets


A client socket cannot receive incoming connections
Server sockets need to maintain more state
TCP establishes connections thanks to the three-way handshake:
Server sockets must allocate resources for handling connections
To convert a client socket to a server socket, use listen()
And indicate how many not-yet-accepted connections can be supported
in parallel
If this number is exceeded, the server will refuse connections


listen()

The interface is simple


    #include <sys/socket.h>
    int listen(int sockfd, int backlog);

sockfd: the socket descriptor
backlog: the maximum length of the queue of pending connections (often set to 5)
Return value: 0 for success, -1 for error
Note
backlog is not a limit on the number of connections established in
parallel!
It only limits the number of pending connections (i.e., connections
that have not yet been accepted)
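
Putting socket(), bind() and listen() together, a server socket could be set up roughly as follows. This is a sketch under assumptions: the helper name make_listener is invented here, and binding to INADDR_ANY (all local addresses) is one possible choice among others.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: return a TCP socket bound to port on all local
 * addresses and switched to server mode, or -1 on error. */
int make_listener(uint16_t port, int backlog) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *) &addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);                    /* do not leak the descriptor */
        return -1;
    }
    return fd;
}
```

Passing port 0 asks the system to choose a free port, which is convenient for experiments; a real server would normally use a fixed, well-known port.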


Initiating a TCP connection

Clients initiate connections to servers thanks to connect():


    #include <sys/types.h>
    #include <sys/socket.h>
    int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);

sockfd: the socket descriptor
serv_addr: a pointer to a struct sockaddr_in containing the address
to connect to
You must specify the destination IP address and port number
addrlen: sizeof(struct sockaddr_in)
Return value: 0 for success, -1 for error
If the client's socket is not yet bound, connect() binds it to an unused port chosen by the system
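
A client-side sketch of these steps might look like this. The helper name connect_to is an assumption for illustration, and the code handles dotted IPv4 addresses only (no name resolution).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to dotted_ip:port.
 * Returns the connected socket descriptor, or -1 on error. */
int connect_to(const char *dotted_ip, uint16_t port) {
    struct sockaddr_in serv_addr;
    memset(&serv_addr, 0, sizeof serv_addr);
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_port = htons(port);
    serv_addr.sin_addr.s_addr = inet_addr(dotted_ip);
    if (serv_addr.sin_addr.s_addr == INADDR_NONE)
        return -1;                    /* not a valid dotted address */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* No bind() here: connect() picks the local port for us */
    if (connect(fd, (struct sockaddr *) &serv_addr, sizeof serv_addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After a successful call, getsockname() on the returned descriptor shows the local port that was assigned during connect().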


Waiting for an Incoming Connection

accept() blocks the process until an incoming connection is received


When a connection is received, accept() creates a new socket
dedicated to this connection
The new socket is used to communicate with the client
The original socket is immediately ready to wait for other connections

accept():

    #include <sys/types.h>
    #include <sys/socket.h>
    int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

sockfd: the socket descriptor
addr: a pointer to a sockaddr_in structure where the address of the
client will be copied
addrlen: a pointer to a socklen_t holding the size of addr; on return, it holds the size of the address actually stored
Return value: the descriptor of the newly created socket, or -1 for
error


Example use of accept

Example

    int sock, newsock, res;
    struct sockaddr_in client_addr;
    socklen_t addrlen;

    /* ... the socket sock is created and bound ... */
    res = listen(sock, 5);
    if (res < 0) { ... }
    addrlen = sizeof(struct sockaddr_in);
    newsock = accept(sock, (struct sockaddr *) &client_addr, &addrlen);
    if (newsock < 0) { ... }
    else
    {
        printf("Connection from %s!\n", inet_ntoa(client_addr.sin_addr));
    }


Writing data to a socket

write works the same for sending data to a TCP socket or writing to
a file

    #include <unistd.h>
    ssize_t write(int sockfd, const void *buf, size_t count);

sockfd: socket descriptor


buf: buffer to be sent
count: size of buffer
Return value: number of bytes sent, or -1 for error
Attention: When writing to a socket, write may send fewer bytes
than requested
Due to limits in internal kernel buffer space
Always check the return value of write, and resend the
non-transmitted data
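
The advice about checking the return value can be sketched as a loop that keeps calling write() until everything has been sent. The name write_all is a made-up helper, and retrying on EINTR is an extra assumption not discussed on the slide.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write exactly count bytes from buf to sockfd.
 * Returns 0 once everything has been written, -1 on error. */
int write_all(int sockfd, const void *buf, size_t count) {
    const char *p = buf;
    while (count > 0) {
        ssize_t n = write(sockfd, p, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                       /* skip the bytes already sent */
        count -= (size_t) n;
    }
    return 0;
}
```

Because write() treats sockets and files alike, the same helper works on pipes and regular files, which makes it easy to try out without a network.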


Reading data from a socket

read() blocks the process until receiving data from the socket

    #include <unistd.h>
    ssize_t read(int sockfd, void *buf, size_t count);

sockfd: socket descriptor


buf: buffer where to write the data read
count: size of buffer
Return value: number of bytes read, or -1 for error
Attention: When reading from a socket, read() may read fewer bytes
than requested
It delivers the data that have been received
This does not mean that the stream of data is finished, there may be
more to come
End-of-file (EOF) is signaled to the reader by read() returning 0
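
Since read() may return fewer bytes than requested, a receiver that expects a fixed amount of data typically loops. Here is a minimal sketch; the name read_full is invented for this example, and retrying on EINTR is an added assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read until count bytes have arrived or EOF is reached.
 * Returns the number of bytes read (less than count only at EOF), or -1 on
 * error. */
ssize_t read_full(int sockfd, void *buf, size_t count) {
    char *p = buf;
    size_t total = 0;
    while (total < count) {
        ssize_t n = read(sockfd, p + total, count - total);
        if (n == 0)
            break;                    /* EOF: the other side has finished */
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal: retry */
            return -1;
        }
        total += (size_t) n;
    }
    return (ssize_t) total;
}
```

A return value smaller than count therefore means the stream ended early, not that more data is still on its way.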


Closing a TCP socket

To stop sending data to a socket, use close():

    #include <unistd.h>
    int close(int sockfd);

Anyone can call this, either the client or the server


This sends an EOF message to the other party
When receiving an EOF, read returns 0 bytes
Subsequent reads and writes on the closed socket will return errors


Asymmetric Disconnection

Sometimes you may want to tell the other party that you are finished,
but let it finish before closing the connection

    #include <sys/socket.h>
    int shutdown(int sockfd, int how);

how: SHUT_WR for stopping writing, SHUT_RD for stopping reading
When one party has shut down its sending side, the other can still write
data (and then close the connection as well)

To initiate a disconnection:
• shutdown(fd, SHUT_WR)
• Keep on reading the last data
• Until receiving an EOF
• close() the socket

To receive a disconnection:
• read() receives an EOF
• Keep on writing the last data
• Then close() the socket
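
The initiating side of this exchange could be sketched as follows. The function name finish_and_drain is hypothetical, and keeping only the first cap bytes of the peer's remaining data is a simplification chosen for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for the side that initiates the disconnection:
 * stop writing, read the peer's last data until EOF, then close.
 * Stores at most cap bytes in buf. Returns the number of bytes stored,
 * or -1 on error. The descriptor is closed in every case. */
ssize_t finish_and_drain(int fd, char *buf, size_t cap) {
    size_t total = 0;
    char scratch[256];

    if (shutdown(fd, SHUT_WR) < 0) {  /* tell the peer we are finished */
        close(fd);
        return -1;
    }
    for (;;) {
        ssize_t n = read(fd, scratch, sizeof scratch);
        if (n == 0)
            break;                    /* EOF: the peer has finished too */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        for (ssize_t i = 0; i < n && total < cap; i++)
            buf[total++] = scratch[i];   /* keep at most cap bytes */
    }
    close(fd);
    return (ssize_t) total;
}
```

The receiving side needs no special code: its read() returns 0 once the shutdown arrives, after which it can still write its last data before calling close().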



I/O Multiplexing

I/O Multiplexing

How can a program handle multiple file descriptors simultaneously?


accept() and read() block programs until something is received
How can you wait for connections/data from multiple sockets?
Several methods:
Use multiple processes
Resource consuming, hard to program
Use non-blocking I/O
Requires polling each descriptor in a loop, which wastes CPU time
select() monitors multiple file descriptors
It blocks the program until one of them is ready for reading or writing
poll() is similar to select()
It reports more detailed per-descriptor events (e.g., errors and hang-ups)
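
As a hedged illustration of the select() approach, the following sketch waits until one of several descriptors becomes readable. The helper name wait_readable and the millisecond timeout parameter are assumptions made for this example.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: wait up to timeout_ms for one of the n descriptors in
 * fds to become readable. Returns the index of the first readable one, or -1
 * on timeout or error. */
int wait_readable(const int *fds, int n, int timeout_ms) {
    fd_set readfds;
    int maxfd = -1;

    FD_ZERO(&readfds);
    for (int i = 0; i < n; i++) {
        FD_SET(fds[i], &readfds);     /* descriptors must be < FD_SETSIZE */
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* select() overwrites readfds, leaving only the ready descriptors set */
    int ready = select(maxfd + 1, &readfds, NULL, NULL, &tv);
    if (ready <= 0)
        return -1;

    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readfds))
            return i;
    return -1;
}
```

A listening socket counts as readable when a connection is waiting, so the same call can watch for new connections and for incoming data at once.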



Server Structures

Server Structures

Often, a server accepts connections on a single (TCP) socket


But it wants to process several requests simultaneously
Better use of the server’s resources
Incoming requests can start being processed immediately after reception
Depending on its nature, a server can receive anywhere from 0 to tens
of thousands of requests per second
Several server structures can be used:
Iterative (i.e., not concurrent)
One child per client
Prefork
Select loop
Many other variants. . .


Iterative Servers

An iterative server treats one request after the other

    int fd, newfd;
    while (1) {
        newfd = accept(fd, ...);
        treat_request(newfd);
        close(newfd);
    }

Simple
Potentially low resource utilization
If treat_request() does not use all of the CPU, resources are wasted
Potentially long waiting queue of incoming connections waiting to be
accept()ed
Increased request treatment latency
If the queue increases, the server may start rejecting incoming
connections


One Child Per Client

A new process is created to handle each connection

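    #include <signal.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Reap terminated children so that they do not remain as zombies */
    void sigchld(int signo) {
        (void) signo;
        while (waitpid(-1, NULL, WNOHANG) > 0) {}
        signal(SIGCHLD, sigchld);      /* some systems reset the handler */
    }

    int main() {
        int fd, newfd, pid;
        /* ... the socket fd is created, bound and listening ... */
        signal(SIGCHLD, sigchld);
        while (1) {
            newfd = accept(fd, ...);
            if (newfd < 0) continue;   /* e.g., interrupted by SIGCHLD */
            pid = fork();
            if (pid == 0) {            /* child: treat the request, then exit */
                treat_request(newfd);
                exit(0);
            }
            else {                     /* parent: close its copy of the socket */
                close(newfd);
            }
        }
    }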


This is the most common type of concurrent server


Several requests are treated simultaneously
Incoming requests are accepted and treated immediately
It may not be suitable for highly loaded servers
fork() is relatively expensive
This structure puts no limit on the number of concurrent requests


Preforked

The server first creates a pool of processes dedicated to treating requests

    #define NB_PROC 10

    void recv_requests(int fd) {             /* An iterative server */
        int f;
        while (1) {
            f = accept(fd, ...);
            treat_request(f);
            close(f);
        }
    }

    int main() {
        int fd;
        /* ... the socket fd is created, bound and listening ... */
        for (int i = 0; i < NB_PROC; i++) {  /* Create NB_PROC children */
            if (fork() == 0) recv_requests(fd);
        }
        while (1) pause();                   /* The parent process does nothing */
    }


Highly loaded servers are often structured as preforked servers


Concurrent request treatment
No time wasted calling fork() each time a connection is received
You can limit the number of concurrent requests
For example, the Apache Web server is structured that way
There are variants to this model
With a thread pool instead of a process pool
Not all systems allow multiple processes to accept() the same socket
simultaneously
Sometimes it is necessary to synchronize accesses to accept()
lock_mutex(); accept(); unlock_mutex();


Select Loop

A single process manages multiple connections simultaneously thanks to select()
This is quite difficult to do correctly
You must split request treatment into a set of non-blocking stages
You must maintain a list of data structures containing the current state
of each concurrent request
Which stage it is in
All internal data it needs
For example, the Squid Web cache is implemented as a select loop
