Department of Computer Science: Notes On Interprocess Communication in Unix
These notes explain how you can write "distributed programs" in C or C++ running over Unix. In
particular, we tell you how to arrange that a process in one computer can send information that is
received by a process in another computer.
The facilities for interprocess communication (IPC) originate from 4.2 BSD Unix. They are
implemented as a collection of system functions in the Unix kernel and are accessed through a
system call interface.
There are two approaches to the use of IPC. In the first approach communication consists of simple
one-off messages known as datagrams. In the second approach, the communicating processes
establish a prior connection and then communicate by transmitting information via the connection.
The first section of these notes introduces the objects used for datagram communication in Unix IPC,
some of which are also used in stream communication. Sockets are the end points of communication in
processes; computers have internet addresses and port numbers. Messages are sent to socket
addresses. Section 2 introduces the system calls for using these objects to send datagrams. Section 3
explains how to make processes communicate via connections.
Figure 1 (diagram): a sending process with socket s1, bound to any port on its own computer, sends a
message to socket s2 of a receiving process, bound to the agreed port on the computer kylie
(192.135.233.215).
sending process:
    s1 = socket(...)
    bind(s1, to any port number on local computer)
    sendto(s1, agreed socket address)
receiving process:
    s2 = socket(...)
    bind(s2, agreed socket address)
    recvfrom(s2, ...)
See Figure 1 in which a process is oval and a computer is a rectangle. Sockets and ports are shown as
matching semicircles - sockets are shaded and shown within a process. The binding is indicated by
putting the socket and port together. After a remote process has bound its socket to a socket address,
the socket may be addressed indirectly by another process referring to the appropriate socket
address.
1.5 Messages
Every message is sent by a process on one computer through a socket with a particular protocol via a
port number on that computer. It is received by a process on another computer through a socket using
the same protocol via a port number on the latter computer. To summarise the steps:
a process on the destination computer opens a socket and binds it
a process on the source computer sends a message through a socket via a port number on its own
computer
the message arrives at the destination port number on destination computer
waiting process on destination computer receives message through a socket
(both sockets must use the same protocol)
Sockets are private to processes and port numbers belong to computers. Processes can use the socket
addresses of local or remote port numbers. Therefore the path of communication is:
socket (in sender ) → local port number → destination socket address → socket (in receiver)
where sender and receiver are processes.
Messages are uninterpreted sequences of 8-bit bytes (in C terminology, unsigned char). It is the
programmer’s responsibility to make sure that the message is intelligible to the receiver, which
may not be on the same type of computer as the sender.
Eight bit bytes may be transmitted in messages as they are. However, anything larger than a single
byte (e.g. an integer) must be sent in a standard network ordering. The library functions htons and
htonl (host to network short or long) may be used to convert a short or long integer from host ordering
to network ordering. The functions ntohs and ntohl (network to host short or long) may be used to
convert them back.
You should also note that C structures may have slightly different sizes on different machines,
with invisible padding added between elements of different types so that integers are aligned
conveniently.
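The two points above (network ordering and structure padding) can be handled together by copying each field into a byte buffer yourself rather than sending a C structure as it stands. The following is a minimal sketch under assumptions: the record layout (a 16-bit code followed by a 32-bit value) and the names packRecord, unpackRecord and PACKED_SIZE are hypothetical and are not taken from these notes.

```c
#include <arpa/inet.h>   /* htons, htonl, ntohs, ntohl */
#include <stdint.h>      /* uint16_t, uint32_t */
#include <string.h>      /* memcpy */

#define PACKED_SIZE 6    /* 2 bytes for the code + 4 bytes for the value */

/* Copy a 16-bit code and a 32-bit value into buf in network order.
   buf must have room for PACKED_SIZE bytes. No padding is involved
   because each field is copied to an explicit offset. */
void packRecord(unsigned char *buf, uint16_t code, uint32_t value)
{
    uint16_t netCode = htons(code);
    uint32_t netValue = htonl(value);

    memcpy(buf, &netCode, sizeof netCode);
    memcpy(buf + sizeof netCode, &netValue, sizeof netValue);
}

/* Reverse of packRecord: read both fields back into host order. */
void unpackRecord(const unsigned char *buf, uint16_t *code, uint32_t *value)
{
    uint16_t netCode;
    uint32_t netValue;

    memcpy(&netCode, buf, sizeof netCode);
    memcpy(&netValue, buf + sizeof netCode, sizeof netValue);
    *code = ntohs(netCode);
    *value = ntohl(netValue);
}
```

The sender would pass buf and PACKED_SIZE to sendto, and the receiver would call unpackRecord on the bytes it gets from recvfrom, so both sides agree on the layout whatever their native byte order.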
To summarise: from the point of view of a process, the end points of communication are a private
socket descriptor and a public socket address. The socket descriptor must refer to a socket created by
the process concerned. The socket address must have been bound to a socket descriptor at the
destination process. In other words, sender and receiver respectively:
send a message through one of my sockets to socket address
receive a message through one of my sockets
For C++ programs, the prototypes of the system calls are required in an extern "C" definition. The
most convenient way of providing these is to include the file (as addressed at QMW):
/import/GCC/lib/g++-include/sys/socket.h
If you need to use the library function inet_ntoa (as illustrated in Figure 3 of these notes) you will
have to include its prototype.
extern "C" {
char * inet_ntoa(struct in_addr);
}
To send a message from one process to another process, each process must first create its own socket. In
Figure 1, the sender’s socket descriptor is s1 and the receiver’s s2 - these descriptors are private to
the processes that created them.
Secondly, the two processes must agree on the address of the socket to be used. That amounts to
agreeing on the internet address and port number of the recipient. Once this has been decided, the
receiving process is started on the computer with the agreed internet address and binds its socket to
the agreed socket address.
The sender binds its socket to a socket address referring to any available local port number. The
recipient calls recvfrom in order to receive an incoming message through its socket. The sender calls
sendto and names s1 and the agreed socket address. Figure 2 is a sample procedure that sends two
messages to a specified port number on a given machine. For example, the procedure might be called:
sender("hello", "how are you", "kylie", 4567);
to send the two messages "hello" and "how are you" to port number 4567 on machine named "kylie".
2.1 Creating a socket
First a socket must be created:
int socket (int domain, int type, int protocol);
For example, in Figure 2, we create a new socket for datagram communication in the Internet domain.
A socket is an object that contains information as to the type of communication required, e.g.
datagram or stream, the protocol in use (e.g. UDP or TCP), options (e.g. broadcast) and references to
buffers for the incoming and outgoing messages. It has operations for creating it, binding it to a
socket address, sending and receiving messages through it (or others associated with connections).
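#include <arpa/inet.h>      /* htons, htonl */
#include <netdb.h>          /* gethostbyname, struct hostent */
#include <netinet/in.h>     /* struct sockaddr_in, INADDR_ANY */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>         /* close */

void setBroadcast(int s);   /* see Section 2.7 */
void makeLocalSA(struct sockaddr_in *sa);
void makeDestSA(struct sockaddr_in *sa, char *hostname, int port);

/* send 2 messages to machine at port number */
void sender(char *message1, char *message2, char *machine, int port)
{
    int s, n;
    struct sockaddr_in mySocketAddress, yourSocketAddress;

    if ((s = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
        perror("socket failed");
        return;
    }
    setBroadcast(s);    /* see Section 2.7 */
    makeLocalSA(&mySocketAddress);
    if (bind(s, (struct sockaddr *)&mySocketAddress,
             sizeof(struct sockaddr_in)) != 0) {
        perror("Bind failed");
        close(s);
        return;
    }
    makeDestSA(&yourSocketAddress, machine, port);
    if ((n = sendto(s, message1, strlen(message1), 0,
                    (struct sockaddr *)&yourSocketAddress,
                    sizeof(struct sockaddr_in))) < 0)
        perror("Send 1 failed");
    if ((n = sendto(s, message2, strlen(message2), 0,
                    (struct sockaddr *)&yourSocketAddress,
                    sizeof(struct sockaddr_in))) < 0)
        perror("Send 2 failed");
    close(s);
}

/* fill in a local socket address: any local address, system-chosen port */
void makeLocalSA(struct sockaddr_in *sa)
{
    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;
    sa->sin_port = htons(0);
    sa->sin_addr.s_addr = htonl(INADDR_ANY);
}

/* fill in the socket address of the given port on the named machine */
void makeDestSA(struct sockaddr_in *sa, char *hostname, int port)
{
    struct hostent *host;

    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;
    if ((host = gethostbyname(hostname)) == NULL) {
        printf("Unknown host name\n");
        exit(1);
    }
    sa->sin_addr = *(struct in_addr *)(host->h_addr);   /* already in network order */
    sa->sin_port = htons(port);
}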
The first argument of the socket system call gives the communication domain in which the socket
should be created. The second argument specifies whether we want datagram or stream. The last
argument may be used to specify a particular protocol, but setting it to zero causes the system to
select a suitable protocol.
2.2 Binding a socket to a socket address
After a socket has been created it must be bound to a socket address:
int bind(int s, struct sockaddr *socketName, int addrlength);
The first argument is a socket descriptor returned by a socket system call. The second argument is a
sockaddr structure specifying the name of the socket. The third argument is the size of the
structure in the second argument.
A socket address is represented by a structure whose first field specifies the communication domain.
The following fields give the information expected for that domain. In Figure 2, the communication
domain is the Internet domain and the other fields require a port number and an internet address.
The constant AF_INET given in the first field is actually redundant (because it was supplied in
the socket call), but you have to fill it in all the same.
The sockaddr_in structure is defined in the file /usr/include/netinet/in.h:
/* Socket name, internet style */
struct sockaddr_in {
    short sin_family;
    u_short sin_port;
    struct in_addr sin_addr;
    char sin_zero[8];
};

/* internet address */
struct in_addr {
    union {
        ...                 /* we don't need to know about this */
        u_long S_addr;
    } S_un;
};
The fields of a sockaddr_in structure are used to fill in the protocol family, the port number and the
internet address. The struct in_addr sin_addr field is another structure in which the field s_addr
should be filled in network order. The port number must also be in network order. A
(struct sockaddr_in *) structure which is the Internet form of a socket address may be used as an
argument where the corresponding parameter requires a (struct sockaddr *). This applies to the
system calls bind, sendto and recvfrom.
When an application plans to use a socket to receive a message, it must decide on a port number
which will also be used in the sendto call in the process that transmits the message. The
destination socket address in the sender procedure in Figure 2 uses the agreed port, given as
argument. The procedure makeDestSA is used to fill in a destination internet domain
socket address.
The library function gethostbyname takes the name of a computer as argument and returns a pointer
to a structure (struct hostent) whose fields contain its Internet address. The information it
supplies is already in network order. You should try to design programs to call this function once
only for each computer involved in the communication.
Another library function inet_ntoa may be used to make an ASCII string from an internet address.
Its argument is a struct in_addr. See Figure 3 for a procedure for printing socket addresses.
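As an illustration of inet_ntoa, here is a minimal sketch of a procedure that turns a socket address into text. It is not the procedure from Figure 3; the names formatSA and printSA and the "address:port" layout are assumptions made for this example.

```c
#include <arpa/inet.h>    /* inet_ntoa, ntohs */
#include <netinet/in.h>   /* struct sockaddr_in */
#include <stdio.h>        /* snprintf, printf */

/* Write "a.b.c.d:port" into out (at most outSize bytes, including the
   terminating zero). Returns the length snprintf reports. */
int formatSA(const struct sockaddr_in *sa, char *out, size_t outSize)
{
    /* inet_ntoa returns a pointer to a static buffer, so the result
       must be used before the next call to inet_ntoa */
    return snprintf(out, outSize, "%s:%d",
                    inet_ntoa(sa->sin_addr), ntohs(sa->sin_port));
}

/* Print a socket address on standard output. */
void printSA(const struct sockaddr_in *sa)
{
    char text[32];

    formatSA(sa, text, sizeof text);
    printf("socket address = %s\n", text);
}
```

Note that the port is converted back to host order with ntohs before printing, whereas the address is handed to inet_ntoa in network order, which is what it expects.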
When an application plans to send a message it gives zero as the port number and the system will
select an unused port number. The first binding in Figure 2 binds the socket with descriptor s to the
socket address mySocketAddress.
The local internet address is specified as a pattern rather than a fixed address - the value
INADDR_ANY in the procedure makeLocalSA in Figure 2 means: “use any of my IP
addresses to accept messages”.
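If you want to see which port the system selected after binding with port zero, the getsockname system call reports the socket address a socket is bound to. The sketch below is only an illustration; the helper names openAnyPort and boundPort are hypothetical and do not appear in the figures.

```c
#include <arpa/inet.h>    /* htons, htonl, ntohs */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_ANY */
#include <string.h>       /* memset */
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>       /* close */

/* Create a datagram socket bound to any local address and a port
   chosen by the system. Returns the descriptor, or -1 on failure. */
int openAnyPort(void)
{
    struct sockaddr_in sa;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(0);                   /* 0 = let the system choose */
    sa.sin_addr.s_addr = htonl(INADDR_ANY);   /* any of my IP addresses */
    if (bind(s, (struct sockaddr *)&sa, sizeof sa) != 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Return the port number (in host order) that s is bound to, or -1. */
int boundPort(int s)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;

    if (getsockname(s, (struct sockaddr *)&sa, &len) != 0)
        return -1;
    return ntohs(sa.sin_port);
}
```

A sender normally has no need to know its own port, since the receiver learns it from recvfrom, but printing it can help when debugging.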
2.3 To Send a message
The sendto system call is used to transmit a message to another socket.
int sendto(int s, char * msg, int len, int flags, struct sockaddr *to, int tolen);
The first argument is a socket descriptor returned by a socket system call. The next two arguments
supply the message and the number of bytes in the message. The flags argument is normally zero.
The socket address of the receiver is given in to with tolen specifying its size.
The sendto call specifies a message to be sent to a socket address (see Figure 2). It hands the message
to the underlying UDP and IP protocols and returns the actual number of characters sent. As we
have requested datagram service, the message is transmitted to its destination without further ado
or acknowledgement. The message will in fact be lost unless a process has already opened a socket
and bound it to the destination socket address. Messages will if necessary be queued until recvfrom is
called on that socket. If the message is too long to be sent, there is an error return (and the message is
not transmitted). Most environments restrict the length of datagrams to 8 kilobytes. It is a good idea
to limit the size of the messages sent by your programs to something acceptable by all of the
computers in the network. I suggest you use 1K for most purposes, but if you need a larger message,
then you could use up to 8K.
The socket address of the recipient must include the internet address of the computer and the agreed
port number. In Figure 2, the sendto call uses the address yourSocketAddress filled in by makeDestSA.
2.4 Receiving a message
The recvfrom system call receives a single message via a socket into its buffer and returns the
number of characters received.
int recvfrom(int s, char *buf, int len, int flags, struct sockaddr * from, int *fromlen);
Its use is illustrated in the receiver procedure of Figure 3.
/* receive two messages through port given as argument */
void receiver(int port)
{
    char message1[SIZE], message2[SIZE];
    struct sockaddr_in mySocketAddress, aSocketAddress;
    int s, aLength, n;
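The listing above stops after its declarations. What follows is not the original Figure 3 but a minimal sketch of how a receiver of this kind might be written, assuming SIZE is 1024 (these notes do not give its value). The helpers openReceiver and receiveOne are hypothetical names introduced for the example.

```c
#include <arpa/inet.h>    /* htons, htonl, ntohs, inet_ntoa */
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define SIZE 1024   /* assumed buffer size */

/* Create a datagram socket bound to the agreed port on any local
   address. Returns the descriptor, or -1 on failure. */
int openReceiver(int port)
{
    struct sockaddr_in mySocketAddress;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0) {
        perror("socket failed");
        return -1;
    }
    memset(&mySocketAddress, 0, sizeof mySocketAddress);
    mySocketAddress.sin_family = AF_INET;
    mySocketAddress.sin_port = htons(port);
    mySocketAddress.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(s, (struct sockaddr *)&mySocketAddress,
             sizeof mySocketAddress) != 0) {
        perror("Bind failed");
        close(s);
        return -1;
    }
    return s;
}

/* Wait for one datagram on s, store it in buf (SIZE bytes) as a C
   string and print the sender's socket address.
   Returns the message length, or -1 on error. */
int receiveOne(int s, char *buf)
{
    struct sockaddr_in aSocketAddress;
    socklen_t aLength = sizeof aSocketAddress;
    int n = recvfrom(s, buf, SIZE - 1, 0,
                     (struct sockaddr *)&aSocketAddress, &aLength);

    if (n < 0) {
        perror("Receive failed");
        return -1;
    }
    buf[n] = '\0';   /* messages are raw bytes, so terminate before printing */
    printf("received \"%s\" from %s:%d\n", buf,
           inet_ntoa(aSocketAddress.sin_addr),
           ntohs(aSocketAddress.sin_port));
    return n;
}

/* receive two messages through port given as argument */
void receiver(int port)
{
    char message1[SIZE], message2[SIZE];
    int s = openReceiver(port);

    if (s < 0)
        return;
    receiveOne(s, message1);
    receiveOne(s, message2);
    close(s);
}
```

For example, receiver(4567) started on kylie would pair with the call sender("hello", "how are you", "kylie", 4567) shown earlier. The receiver must be started first, because a datagram sent to a port with no bound socket is lost.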
Figure 4 - Broadcasting
Sockets contain information as to their functionality - for example, whether they can be used to
transmit broadcasts. The usual default is not to be able to transmit broadcasts. The library function
setsockopt - set socket option - can be used to set the parameter of the socket to allow it to transmit
broadcasts. The procedure in Figure 4 will turn on broadcasting if necessary. See Figure 2 for an
example of where to insert a call to this procedure.
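A procedure of the kind described above might look as follows. This is a sketch rather than the original Figure 4: it assumes the standard SO_BROADCAST option at the SOL_SOCKET level and keeps the name setBroadcast used in Figure 2.

```c
#include <stdio.h>        /* perror */
#include <sys/socket.h>   /* getsockopt, setsockopt, SO_BROADCAST */
#include <sys/types.h>

/* Turn on the broadcast option of socket s if it is not already on. */
void setBroadcast(int s)
{
    int on = 0;
    socklen_t len = sizeof on;

    if (getsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, &len) != 0) {
        perror("getsockopt failed");
        return;
    }
    if (on)
        return;             /* broadcasting is already allowed */
    on = 1;
    if (setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof on) != 0)
        perror("setsockopt failed");
}
```

The option only gives the socket permission to send to a broadcast address; the destination socket address passed to sendto must still name one.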
Figure 5 - server listens and accepts connection, and returns new socket
The server process binds its socket to a socket address as in datagram communication and then gets
ready to accept requests for connection. Note that the second argument to the socket system call is
given as SOCK_STREAM, to indicate that stream communication is required. If the third argument
is left as zero, the TCP protocol will be selected automatically.
The first stage is to use the listen system call to specify the maximum number of requests for
connections that can be queued at this socket. This number is usually set to five and means that if the
number of outstanding requests exceeds 5 they will be ignored. The second stage is to use the accept
system call to accept any connection that is requested. These two stages are shown in Figure 5.
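The two stages can be sketched as below. The names startUp and acceptConnection are the ones used in the Figure 9 call sequence later in these notes, but the bodies here are an assumption about what such procedures might contain, not the original Figure 5.

```c
#include <arpa/inet.h>    /* htons, htonl */
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a stream socket, bind it to the given port on any local
   address and start listening with a queue of five requests.
   Returns the listening descriptor, or -1 on failure. */
int startUp(int port)
{
    struct sockaddr_in mySocketAddress;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0) {
        perror("socket failed");
        return -1;
    }
    memset(&mySocketAddress, 0, sizeof mySocketAddress);
    mySocketAddress.sin_family = AF_INET;
    mySocketAddress.sin_port = htons(port);
    mySocketAddress.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(s, (struct sockaddr *)&mySocketAddress,
             sizeof mySocketAddress) != 0) {
        perror("Bind failed");
        close(s);
        return -1;
    }
    if (listen(s, 5) != 0) {
        perror("listen failed");
        close(s);
        return -1;
    }
    return s;
}

/* Block until a client requests a connection on listening socket s.
   Returns a new descriptor for that connection, or -1 on failure. */
int acceptConnection(int s)
{
    struct sockaddr_in aSocketAddress;    /* filled in with the client's address */
    socklen_t aLength = sizeof aSocketAddress;
    int sNew = accept(s, (struct sockaddr *)&aSocketAddress, &aLength);

    if (sNew < 0) {
        perror("accept failed");
        return -1;
    }
    return sNew;
}
```

The descriptor returned by startUp is used only for accepting connections; all reading and writing happens on the descriptor returned by acceptConnection.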
Figure 5 does not show the server closing the socket on which it listens. Normally a server would
first listen and then fork a new process to accept the connection and communicate with the client.
Meanwhile it will continue to listen in the original process.
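#include <stdio.h>      /* perror, printf, fprintf */
#include <unistd.h>     /* read, write, close */

/* server receives amount bytes and sends them back;
   SIZE is the buffer size defined elsewhere in the program */
void readAndWrite(int s, int amount)
{
    int n, nRead = 0;
    char buf[SIZE], *p = buf;

    if (amount > SIZE) {            /* would overflow buf */
        fprintf(stderr, "Amount too much\n");
        close(s);
        return;
    }
    while (nRead < amount) {
        if ((n = read(s, p, amount - nRead)) < 0) {
            perror("Receive");
            close(s);
            return;
        } else if (n == 0) {
            break;                  /* client closed the connection */
        } else {
            p += n;
            nRead += n;
        }
    }
    if ((n = write(s, buf, nRead)) < 0)
        perror("Send failed");
    else
        printf("wrote %d\n", n);
    close(s);
}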
The accept system call acts rather like a combination of socket and bind - it returns a socket
descriptor for a new socket bound with both local and remote addresses.
Figure 7 - client requests connection and sends and receives messages via it
Figure 8 shows the sequence of events at client and server that lead up to the establishment of a
connection. Note that both processes open sockets and bind them. The server listens and offers to
accept connections via its socket s2. The client requests a connection via its socket s1. The accept
succeeds and a new socket s3 is opened for the connection to s1.
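Figure 8 (diagram): a client process with socket s1, bound to any port on its own computer, requests a
connection to socket s2 of the server process, bound to the agreed port on the computer kylie
(192.135.233.215). The server's new socket s3 carries the connection.
client process:
    s1 = socket(...)
    bind(s1, to any port number on local computer)
    connect(s1, server socket address)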
The write system call takes arguments specifying a socket descriptor, a message and its length. The
write call is similar to the write call for files. It specifies a message to be sent via the connection. It
hands the message to the underlying TCP and IP protocols and returns the actual number of
characters sent.
The read system call receives some characters in its buffer and returns the number of characters
received. The connection behaves like a stream - any available data is read immediately, in the
same sequence as it was written by the client write calls. There is no indication of message
boundaries.
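The client side can be sketched in the same way. The procedure names requestConnection and writeAndRead and their arguments follow the Figure 9 call sequence; everything else (the bodies, the return values, and SIZE being 1024) is an assumption made for illustration and is not the original Figure 7.

```c
#include <arpa/inet.h>    /* htons */
#include <netdb.h>        /* gethostbyname, struct hostent */
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define SIZE 1024   /* assumed buffer size */

/* Open a stream socket and connect it to port on the named machine.
   Returns the connected descriptor, or -1 on failure. */
int requestConnection(const char *machine, int port)
{
    struct sockaddr_in yourSocketAddress;
    struct hostent *host = gethostbyname(machine);
    int s;

    if (host == NULL || host->h_addrtype != AF_INET) {
        fprintf(stderr, "Unknown host name\n");
        return -1;
    }
    memset(&yourSocketAddress, 0, sizeof yourSocketAddress);
    yourSocketAddress.sin_family = AF_INET;
    memcpy(&yourSocketAddress.sin_addr, host->h_addr_list[0],
           sizeof yourSocketAddress.sin_addr);
    yourSocketAddress.sin_port = htons(port);
    if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket failed");
        return -1;
    }
    /* An explicit bind to port 0 is omitted here: connect assigns a
       local port itself if the socket has not been bound. */
    if (connect(s, (struct sockaddr *)&yourSocketAddress,
                sizeof yourSocketAddress) != 0) {
        perror("connect failed");
        close(s);
        return -1;
    }
    return s;
}

/* Write n1 bytes of m1 and n2 bytes of m2, then read the reply until
   n1 + n2 bytes have arrived or the server closes the connection.
   Returns the number of bytes read, or -1 on error. */
int writeAndRead(int s, const char *m1, const char *m2, int n1, int n2)
{
    char reply[SIZE];
    int n, nRead = 0, amount = n1 + n2;

    if (amount > SIZE) {
        fprintf(stderr, "Amount too much\n");
        return -1;
    }
    if (write(s, m1, n1) < 0 || write(s, m2, n2) < 0) {
        perror("write failed");
        return -1;
    }
    while (nRead < amount) {
        n = read(s, reply + nRead, amount - nRead);
        if (n < 0) {
            perror("read failed");
            return -1;
        }
        if (n == 0)
            break;                  /* server closed the connection */
        nRead += n;
    }
    printf("read %d bytes: %.*s\n", nRead, nRead, reply);
    return nRead;
}
```

The read loop is needed because a single read may return only part of the reply; the two writes, on the other hand, may well reach the server as one run of ten bytes.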
server                          client
s2 = startUp(AGREED_PORT)
s3 = acceptConnection(s2)
                                s1 = requestConnection("it063", AGREED_PORT)
readAndWrite(s3, 10)            writeAndRead(s1, "hello", "there", 5, 5)
close(s2)
Figure 9 illustrates how to start server and client processes that respectively call the procedures in
Figures 6 and 7. In this example, the client process sends two messages; the server process can receive
both the messages in the same read. On the other hand the read call may not get all the characters
written at the first read. You have to call read repeatedly until you have read sufficient
characters.
4 Conclusions
These notes have described the Unix IPC primitives.
If you need to implement clients and servers you should normally use remote procedure calling (RPC)
or remote method invocation - it does all the work of unpacking the procedure name and arguments,
selecting the procedure to be executed and returning the results. RPC facilities in Unix may be
constructed in a layer above the Unix IPC system functions.
There are some situations in which you may need to use message passing, in which case IPC may be
more appropriate.
The programs given in these notes are in an accompanying file.