Communication in Distributed Systems
Communication mechanisms
GITONGA MUNYI ©2024 COMMUNICATION IN DISTRIBUTED SYSTEMS
OFFSET
This integer indicates the offset of the user data within the segment. This field is required only because the
number of bits used in the OPTIONS field can vary.
URGENT POINTER
This field can be initialized to point to a place in the user data where urgent information such as
escape codes etc. are placed. Then the receiving host can process this part immediately when it
receives the segment.
The Internet Protocol layer in the TCP/IP protocol stack is the first layer that introduces
the virtual network abstraction that is the basic principle of the Internet model. Ideally, all
physical implementation details are hidden below the IP layer, although in practice this is
not quite true. The IP layer provides an unreliable, connectionless delivery system.
It is unreliable because the protocol does not provide any functionality for recovering
from errors: datagrams may be duplicated, lost, or arrive at the remote host in a different
order from the one in which they were sent. If no such errors occur in the
physical layer, the IP protocol guarantees that the transmission completes
successfully.
The LENGTH field is the length of the user datagram including the header, so the
minimum value of LENGTH is 8 bytes. The SOURCE PORT and DESTINATION PORT fields are the
connection between an IP address and a process running on a host. A network port is
normally identified by an integer. However, the user datagram itself does not contain any
IP address.
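The header layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a full UDP implementation: it assumes the standard 8-byte UDP header of four 16-bit fields (the CHECKSUM field is not discussed above and is simply left at zero here), and the function names are hypothetical.

```python
import struct

# Four 16-bit fields in network byte order:
# SOURCE PORT, DESTINATION PORT, LENGTH, CHECKSUM (8 bytes in total).
UDP_HEADER = struct.Struct("!HHHH")


def build_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with a UDP-style header (checksum left at 0)."""
    length = UDP_HEADER.size + len(payload)  # LENGTH covers header + data
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def parse_datagram(datagram: bytes) -> dict:
    """Split a datagram into its header fields and user data."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    if length < UDP_HEADER.size:
        raise ValueError("LENGTH must be at least 8 bytes")
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }
```

Note that neither function deals with IP addresses; those belong to the IP layer below.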
1. Synchronization:
Exchange of data is done synchronously which means it has a single clock pulse.
2. Message Passing:
Message passing is used when processes wish to exchange information. It takes several
forms, such as pipes, FIFOs, shared memory, and message queues.
calls to transmit the data, which means the sender does not wait for an acknowledgment
from the receiver.
3. Message Destination:
A local port is a message destination within a computer, specified as an integer. A
port has exactly one receiver but can have many senders. A process may use multiple
ports from which to receive messages. Any process that knows the number of a port can
send a message to it.
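The idea of a port as a message destination can be tried out with UDP sockets on the loopback interface. The sketch below is only an illustration under assumptions: the helper names are hypothetical, everything runs in one process for brevity, and a two-second timeout is added so the receiver cannot block forever.

```python
import socket


def open_port() -> socket.socket:
    """Bind a UDP socket to a free local port; its owner is the one receiver."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 asks the OS to pick a free port
    receiver.settimeout(2.0)
    return receiver


def send_to_port(port: int, message: bytes) -> None:
    """Any sender that knows the port number can send a message to it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(message, ("127.0.0.1", port))


def demo() -> list:
    """Two independent senders deliver to the same port (one receiver)."""
    receiver = open_port()
    try:
        port = receiver.getsockname()[1]  # the integer that names the port
        for text in (b"from sender A", b"from sender B"):
            send_to_port(port, text)
        return sorted(receiver.recvfrom(1024)[0] for _ in range(2))
    finally:
        receiver.close()
```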
4. Reliability:
Reliable communication is defined in terms of validity and integrity.
5. Integrity:
Messages must arrive at the destination uncorrupted and without duplication.
6. Validity:
A point-to-point message service is said to have validity if messages are
guaranteed to be delivered without being lost.
7. Ordering:
It is the process of delivering messages to the receiver in a particular order. Some
applications require messages to be delivered in the sender order i.e. the order in
which they were transmitted by the sender.
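One common way to obtain sender ordering and to suppress duplicates is to number the messages. The sketch below assumes that the sender attaches a sequence number starting at 0 to every message; that numbering scheme and the class name are assumptions made for illustration and are not prescribed by the notes above.

```python
class OrderedReceiver:
    """Delivers messages in sender order and silently drops duplicates."""

    def __init__(self):
        self.next_seq = 0    # sequence number expected next
        self.pending = {}    # early arrivals, keyed by sequence number
        self.delivered = []  # messages handed to the application, in order

    def receive(self, seq: int, payload: str) -> None:
        # Already delivered or already buffered: a duplicate, so ignore it.
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = payload
        # Deliver every message that is now contiguous with what came before.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
```

A message that arrives early is held back until all of its predecessors have been delivered, so the application sees the sender's order even when the network reorders or repeats messages.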
It further aims at hiding most of the intricacies of message passing and is ideal for
client-server applications.
RPC allows programs to call procedures located on other machines. The message-passing
procedures ‘send’ and ‘receive’ do not conceal the communication, yet concealing it is
necessary for achieving access transparency in distributed systems; RPC provides this
concealment.
Information can be transported in the form of parameters and can come back in the
procedure result. No message passing is visible to the programmer. However, because the
calling and called procedures exist on different machines, they execute in different
address spaces, the parameters and results must be interpreted identically on both
machines, and a machine may crash during communication; all of these cause problems.
In a conventional procedure call, the stack is in its initial state before the call. To make
the call, the caller pushes the parameters onto the stack in order, last one first. After the
called procedure (read, in the running example) has finished running, it puts the return
value in a register, removes the return address, and transfers control back to the caller.
Parameters can be passed by value or by reference.
Call by Value: Here the parameters are copied into the stack. The value
parameter is just an initialized local variable. The called procedure may
modify the variable, but such changes do not affect the original value at
the calling side.
The client and server may also use different data representations even
for simple parameters. Stubs are used to perform the conversion of the
parameters, so that a remote function call looks like a local function call to the
remote computer. For transparency of RPC, the calling procedure should not know that the
called procedure is executing on a different machine.
program.
Client Stub: Used when read is a remote procedure. The client stub is put
into a library and is called using the normal calling sequence. It makes a call
to the local operating system, but it does not ask the operating system to give
it data; instead, it asks for a request to be sent to the server and then blocks
until the reply comes back.
Server Stub: When a message arrives at the server, it goes directly to the server stub.
The server stub is the server-side equivalent of the client stub: it
unpacks the parameters from the message and then calls the server
procedure in the usual way.
1. The client procedure calls the client stub in the normal way.
2. The client stub builds a message and calls the local operating system.
3. The client's OS sends the message to the remote OS.
4. The remote OS gives the message to the server stub.
5. The server stub unpacks the parameters and calls the server.
6. The server does the work and returns the result to the stub.
7. The server stub packs it in a message and calls its local OS.
8. The server's OS sends the message to the client's OS.
9. The client's OS gives the message to the client stub.
10. The stub unpacks the result and returns to the client.
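The flow of an RPC through the two stubs can be sketched in a single Python process. This is an illustration under stated assumptions, not a real RPC system: the procedure `add`, the JSON message layout, and the `transport` function (which stands in for both operating systems and the network) are all hypothetical.

```python
import json


# ---- server side ----
def add(a, b):
    """The server procedure: does the actual work."""
    return a + b


PROCEDURES = {"add": add}


def server_stub(request: bytes) -> bytes:
    call = json.loads(request)                        # unpack the parameters
    result = PROCEDURES[call["proc"]](*call["args"])  # ordinary local call
    return json.dumps({"result": result}).encode()    # pack the result


# ---- transport (stand-in for the two operating systems and the network) ----
def transport(request: bytes) -> bytes:
    return server_stub(request)


# ---- client side ----
def remote_add(a, b):
    """Client stub: looks like a local procedure to its caller."""
    request = json.dumps({"proc": "add", "args": [a, b]}).encode()  # build message
    reply = transport(request)          # hand the message over and wait for the reply
    return json.loads(reply)["result"]  # unpack the result and return it
```

The client simply writes `remote_add(2, 3)`; nothing in that call reveals that a message was built, sent, and answered.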
TRANSPARENCY OF RPC
A major issue in the design of an RPC facility is its transparency property. A transparent
RPC mechanism is one in which local procedures and remote procedures are (effectively)
indistinguishable to programmers. This requires the following two types of
transparencies:
The calling process is suspended until the called procedure returns. The caller can pass
arguments to the called procedure (remote procedure). The called procedure (remote
procedure) can return results to the caller.
Unfortunately, achieving exactly the same semantics for remote procedure calls as for
local procedure calls is close to impossible. This is mainly because of the following
differences between remote procedure calls and local procedure calls.
1. Unlike local procedure calls, with remote procedure calls the called
procedure is executed in an address space that is disjoint from the calling
program’s address space. Due to this reason, the called (remote) procedure
cannot have access to any variables or data values in the calling program’s
environment. Thus, in the absence of shared memory, it is meaningless to
pass addresses in arguments, making call-by-reference pointers highly
unattractive. Similarly, it is meaningless to pass argument values containing
pointer structures (e.g., linked lists), since pointers are normally
represented by memory addresses.
According to Bal et al. [1989], dereferencing a pointer passed by the caller has to be done
at the caller’s side, which implies extra communication. An alternative implementation is
to send a copy of the value pointed at to the receiver, but this has subtly different
semantics and may be difficult to implement if the pointer points into the middle of a
complex data structure, such as a directed graph. Similarly, call by reference can be
replaced by copy in / copy out, but at the cost of slightly different semantics.
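The "slightly different semantics" of copy in / copy out show up when two parameters refer to the same object. The sketch below is a local simulation only, with hypothetical helper names; it imitates what a stub would do by copying the arguments before the call and copying them back afterwards.

```python
import copy


def add_to_both(x, y):
    """Appends 1 to x, then appends the current length of x to y."""
    x.append(1)
    y.append(len(x))


def call_by_reference(proc, a, b):
    proc(a, b)  # the procedure works directly on the caller's objects


def call_copy_in_copy_out(proc, a, b):
    a_copy, b_copy = copy.deepcopy(a), copy.deepcopy(b)  # copy in
    proc(a_copy, b_copy)                                 # runs on private copies
    a[:] = a_copy                                        # copy out
    b[:] = b_copy


shared = []
call_by_reference(add_to_both, shared, shared)
by_reference = list(shared)     # both parameters alias the same list

shared = []
call_copy_in_copy_out(add_to_both, shared, shared)
by_copy = list(shared)          # the two copies evolve separately
```

With call by reference, the aliased list ends up with two elements; with copy in / copy out, the second copy overwrites the first on the way back and only one element survives.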
2. Remote procedure calls are more vulnerable to failure than local procedure
calls, since they involve two different processes and possibly a network and
two different computers. Therefore, programs that make use of remote
procedure calls must have the capability of handling even those errors that
cannot occur in local procedure calls. The need for the ability to take care
of the possibility of processor crashes and communication problems of a
network makes it even more difficult to obtain the same semantics for
remote procedure calls as for local procedure calls.
3. Remote procedure calls consume much more time (100 – 1000 times
more) than local procedure calls. This is mainly due to the involvement of a
communication network in RPCs. Therefore, applications using RPCs must
also have the capability to handle the long delays that may possibly occur
due to network congestion.
Because of these difficulties in achieving normal call semantics for remote procedure
calls, some researchers feel that the RPC facility should be nontransparent. For example,
Hamilton [1984] argues that remote procedures should be treated differently from local
procedures from the start, resulting in a nontransparent RPC mechanism. Similarly, the
designers of RPC were of the opinion that although the RPC system should hide low-level
details of message passing from the users, failures and long delays should not be hidden
from the caller. That is, the caller should have the flexibility of handling failures and long
delays in an application-dependent manner. In conclusion, although in most
environments total semantic transparency is impossible, enough can be done to ensure
that distributed application programmers feel comfortable.
abstraction by concealing from programs the interface to the underlying RPC system. We
saw that an RPC involves a client process and a server process. Therefore, to conceal the
interface of the underlying RPC system from both the client and server processes, a
separate stub procedure is associated with each of the two processes. Moreover, to hide
the existence and functional details of the underlying network, an RPC communication
package (known as RPCRuntime) is used on both the client and server sides. Thus,
implementation of an RPC mechanism usually involves the following five elements of
program [Birrell and Nelson 1984].
1. The client
2. The client stub
3. The RPCRuntime
4. The server stub
5. The server
The interaction between them is shown in Figure 4.2. The client, the client stub,
and one instance of RPCRuntime execute on the client machine, while the server, the
server stub, and another instance of RPCRuntime execute on the server machine. The job
of each of these elements is described below.
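[Figure 4.2: Interaction between the five RPC elements. On the client machine, the client makes a call and receives a return; the client stub packs the call and unpacks the result; the RPCRuntime sends, waits, and receives. On the server machine, the RPCRuntime receives and sends; the server stub unpacks the call and packs the result; the server is called, executes, and returns. A call packet travels from the client machine to the server machine, and a result packet travels back.]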
Client: The client is a user process that initiates a remote procedure call. To make a
remote procedure call, the client makes a perfectly normal local call that invokes a
corresponding procedure in the client stub.
Client Stub: The client stub is responsible for carrying out the following two tasks:
On receipt of a call request from the client, it packs a specification of the target
procedure and the arguments into a message and then asks the local RPCRuntime to send
it to the server stub.
On receipt of the result of procedure execution, it unpacks the result and passes it to the
client.
RPCRuntime:
The RPCRuntime handles transmission of messages across the network between client
and server machines. It is responsible for retransmissions, acknowledgements, packet
routing, and encryption. The RPCRuntime on the client machine receives the call request
message from the client stub and sends it to the server machine. It also receives the
message containing the result of procedure execution from the server machine and
passes it to the client stub.
On the other hand, the RPCRuntime on the server machine receives the message
containing the result of procedure execution from the server stub and sends it to the
client machine. It also receives the call request message from the client machine and
passes it to the server stub.
Server Stub: The job of the server stub is very similar to that of the client stub. It
performs the following two tasks:
On receipt of the call request message from the local RPCRuntime, the server stub
unpacks it and makes a perfectly normal call to invoke the appropriate procedure in the
server.
On receipt of the result of procedure execution from the server, the server stub packs
the result into a message and then asks the local RPCRuntime to send it to the client
stub.
Server: On receiving a call request from the server stub, the server executes the
appropriate procedure and returns the result of procedure execution to the server stub.
Note here that the beauty of the whole scheme is the total ignorance on the part of the
client that the work was done remotely instead of by the local kernel. When the client
gets control following the procedure call that it made, all it knows is that the results of
the procedure execution are available to it. Therefore, as far as the client is concerned,
remote services are accessed by making ordinary (local) procedure calls, not by using the
send and receive primitives. All the details of the message passing are hidden in the
client and server stubs, making the steps involved in message passing invisible to both
the client and the server.
RPC MESSAGES
Any remote procedure call involves a client process and a server process that are possibly
located on different computers. The mode of interaction between the client and server is
that the client asks the server to execute a remote procedure and the server returns the
result of execution of the concerned procedure to the client. Based on this mode of
interaction, the two types of messages involved in the implementation of an RPC system
are as follows:
1. Call messages that are sent by the client to the server for requesting
execution of a particular remote procedure.
2. Reply messages that are sent by the server to the client for returning the
result of remote procedure execution.
The protocol of the concerned RPC system defines the format of these two types of
message. Normally, an RPC protocol is independent of transport protocols. That is, RPC
does not care how a message is passed from one process to another. Therefore, an RPC
protocol deals only with the specification and interpretation of these two types of
messages.
Call Messages:
In addition to these two fields, a call message normally has the following fields.
messages and duplicate messages in case of system failures and for properly matching
reply messages to outstanding call messages, especially in those cases when the replies
of several outstanding call messages arrive out of order.
4. A message type field that is used to distinguish call messages from reply
messages. For example, in an RPC system, this field may be set to 0 for all
call messages and set to 1 for all reply messages.
5. A client identification field that may be used for two purposes – to allow
the server of the RPC to identify the client to whom the reply message has
to be returned and to allow the server to check the authentication of the
client process for executing the concerned procedure.
Thus, a typical RPC call message format may be of the form shown in Figure 3.2.
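How a message identifier lets a client match replies to outstanding calls, even when replies arrive out of order or twice, can be sketched as follows. The class and field names are hypothetical; the `procedure` and `arguments` fields are an assumption about what a call message carries besides the fields listed above, and the type values 0 and 1 follow the example given for the message type field.

```python
from dataclasses import dataclass
from itertools import count

CALL, REPLY = 0, 1  # message type values, as in the example above


@dataclass
class CallMessage:
    message_id: int
    client_id: str
    procedure: str          # assumed field: which remote procedure to run
    arguments: tuple = ()   # assumed field: its arguments
    message_type: int = CALL


@dataclass
class ReplyMessage:
    message_id: int         # copied from the call being answered
    result: object
    message_type: int = REPLY


class RpcClient:
    def __init__(self, client_id: str):
        self.client_id = client_id
        self._ids = count(1)
        self.outstanding = {}  # message_id -> call still awaiting a reply
        self.results = {}      # message_id -> result received

    def make_call(self, procedure: str, *arguments) -> CallMessage:
        msg = CallMessage(next(self._ids), self.client_id, procedure, arguments)
        self.outstanding[msg.message_id] = msg
        return msg

    def accept_reply(self, reply: ReplyMessage) -> bool:
        call = self.outstanding.pop(reply.message_id, None)
        if call is None:
            return False       # duplicate or unknown reply: ignore it
        self.results[call.message_id] = reply.result
        return True
```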
Reply Messages:
When the server of an RPC receives a call message from a client, it could be faced
with one of the following conditions. In the list below, it is assumed for a particular
condition that no problem was detected by the server for any of the previously listed
conditions:
[Fig. 3.3 A typical RPC reply message format: (a) a successful reply message format, with the fields Message identifier, Message type, Reply status (successful), and Result; (b) an unsuccessful reply message format]
1. Taking the arguments (of a client process) or the result (of a server process)
that will form the message data to be sent to the remote process.
2. Encoding the message data of step 1 above on the sender’s computer. This
encoding process involves the conversion of program objects into a stream
form that is suitable for transmission and placing them into a message
buffer.
The marshaling process must reflect the structure of all types of program objects
used in the concerned language. These include primitive types, structured types, and
user defined types. Marshaling procedures may be classified into two groups :
2. Those that are defined by the users of the RPC system. This group contains
marshaling procedures for user-defined data types and data types that
include pointers. For example, in Concurrent CLU, developed for use in the
Cambridge Distributed Computer System, for user-defined types, the type
definition must contain procedures for marshaling.
A good RPC system should always generate in-line marshaling code for every
remote call so that the users are relieved of the burden of writing their own marshaling
procedures. In practice, however, this goal is difficult to achieve because of the
unacceptably large amounts of code that may have to be generated for handling all
possible data types.
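As an illustration of what a hand-written marshaling procedure for one structured type might look like, the sketch below encodes a (name, x, y) record into a byte stream and decodes it again. The wire layout (a 2-byte length, the UTF-8 name, then two 4-byte signed integers in network byte order) and the function names are assumptions chosen for the example.

```python
import struct


def marshal_point(name: str, x: int, y: int) -> bytes:
    """Encode (name, x, y) into a byte stream suitable for a message buffer."""
    encoded = name.encode("utf-8")
    # "!" selects network byte order, so both machines agree on the layout.
    return struct.pack("!H", len(encoded)) + encoded + struct.pack("!ii", x, y)


def unmarshal_point(buffer: bytes) -> tuple:
    """Decode the byte stream produced by marshal_point on the receiving side."""
    (name_len,) = struct.unpack_from("!H", buffer, 0)
    name = buffer[2:2 + name_len].decode("utf-8")
    x, y = struct.unpack_from("!ii", buffer, 2 + name_len)
    return name, x, y
```

Fixing the byte order and field widths in the encoding is what allows machines with different native data representations to exchange the record.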
In RPC-based applications, two important issues that need to be considered for
server management are server implementation and server creation.
Server Implementation :
Based on the style of implementation used, servers may be of two types : stateful
and stateless.
Stateful Servers:
A stateful server maintains clients’ state information from one remote procedure
call to the next. That is, in the case of two subsequent calls by a client to a stateful server,
some state information pertaining to the service performed for the client as a result of
the first call execution is stored by the server process. This client state information is
subsequently used at the time of executing the second call.
For example, let us consider a server for byte-stream files that allows the
following operations on files :
Open (filename, mode) : This operation is used to open a file identified by filename in
the specified mode. When the server executes this operation, it creates an entry for this
file in a file-table that it uses for maintaining the file state information of all the open
files. The file state information normally consists of the identifier of the file, the open
mode, and the current position of a nonnegative integer pointer, called the read-write
pointer. When a file is opened, its read-write pointer is set to zero and the server returns
to the client a file identifier (fid), which is used by the client for subsequent accesses to
that file.
Read (fid, n, buffer) : This operation is used to get n bytes of data from the file identified
by fid into the buffer named buffer. When the server executes this operation, it returns
to the client n bytes of file data starting from the byte currently addressed by the
read-write pointer and then increments the read-write pointer by n.
Write (fid, n, buffer) : On execution of this operation, the server takes n bytes of data
from the specified buffer, writes it into the file identified by fid at the byte position
currently addressed by the read-write pointer, and then increments the read-write pointer by n.
Seek (fid, position) : This operation causes the server to change the value of the
read-write pointer of the file identified by fid to the new value specified as position.
Close (fid) : This statement causes the server to delete from its file table the file state
information of the file identified by fid.
The file server mentioned above is stateful because it maintains the current state
information for a file that has been opened for use by a client. Therefore, as shown in Fig.
3.3, after opening a file, if a client makes two subsequent Read (fid, 100, buf) calls, the
first call will return the first 100 bytes (bytes 0 – 99) and the second call will return the
next 100 bytes (bytes 100 – 199).
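[Figure: an example of a stateful file server. The server's file table records the fid, mode, and R/W pointer of each open file; the client issues Read (fid, 100, buf) and the server returns bytes 0 to 99.]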
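The stateful server described above can be sketched as a small in-memory class. This is an illustration under assumptions, not a real file server: files are held as byte arrays in a dictionary, the method signatures are simplified (Read returns the bytes instead of filling a client buffer, and Write takes the data directly), and all names are hypothetical.

```python
from itertools import count


class StatefulFileServer:
    def __init__(self, files: dict):
        self.files = {name: bytearray(data) for name, data in files.items()}
        self.file_table = {}  # fid -> {"name", "mode", "pos"}: per-client state
        self._fids = count(1)

    def open(self, filename: str, mode: str) -> int:
        fid = next(self._fids)
        # The read-write pointer ("pos") starts at zero when a file is opened.
        self.file_table[fid] = {"name": filename, "mode": mode, "pos": 0}
        return fid

    def read(self, fid: int, n: int) -> bytes:
        entry = self.file_table[fid]
        start = entry["pos"]
        data = bytes(self.files[entry["name"]][start:start + n])
        entry["pos"] += len(data)  # advance the read-write pointer
        return data

    def write(self, fid: int, data: bytes) -> None:
        entry = self.file_table[fid]
        start = entry["pos"]
        self.files[entry["name"]][start:start + len(data)] = data
        entry["pos"] += len(data)

    def seek(self, fid: int, position: int) -> None:
        self.file_table[fid]["pos"] = position

    def close(self, fid: int) -> None:
        del self.file_table[fid]  # forget the state kept for this open file
```

Because the position lives in the server's file table, two identical Read calls return different bytes, which is exactly what makes the server stateful.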
To keep track of the current record position for each client that has opened the file for
accessing. Therefore, to design an idempotent interface for reading the next record from the
file, it is important that each client keeps track of its own current record position and the
server is made stateless, that is, no client state should be maintained on the server side.
Based on this idea, an idempotent procedure for reading the next record from a sequential
file is
ReadRecordN (Filename, N)
which returns the Nth record from the specified file. In this case, the client has to correctly
specify the value of N to get the desired record from the file.
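A minimal sketch of this stateless design follows. It assumes records are kept in in-memory lists and numbered from 1; the class names and the client-side helper `read_next` are hypothetical.

```python
class StatelessRecordServer:
    def __init__(self, files: dict):
        self.files = files  # filename -> list of records; no client state kept

    def read_record_n(self, filename: str, n: int):
        """Return the Nth record (numbered from 1) of the named file."""
        return self.files[filename][n - 1]


class RecordClient:
    def __init__(self, server: StatelessRecordServer, filename: str):
        self.server = server
        self.filename = filename
        self.next_record = 1  # the client tracks its own position

    def read_next(self):
        record = self.server.read_record_n(self.filename, self.next_record)
        self.next_record += 1
        return record
```

Repeating `read_record_n` with the same arguments always returns the same record, so a retransmitted request does no harm.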
It is clearly not idempotent since repeated execution will add further copies of the same
record to the file. This interface may be converted into an idempotent interface by using the
following two procedures instead of the one defined above :
GetLastRecordNo (Filename)
The first procedure returns the record number of the last record currently in the file, and the
second procedure writes a record at a specified position in the file. Now, for appending a record, the
client will have to use the following two procedures :
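The difference between a non-idempotent append and the two-step idempotent version can be sketched as below. This is an assumption-laden illustration: the name `write_record_n` for the second procedure is hypothetical (its name is not given above), one object stands for one file instead of passing a filename, and records are numbered from 1.

```python
class RecordFile:
    def __init__(self):
        self.records = []

    def append_record(self, record) -> None:
        """Not idempotent: every repeated execution adds another copy."""
        self.records.append(record)

    def get_last_record_no(self) -> int:
        """Record number of the last record currently in the file."""
        return len(self.records)

    def write_record_n(self, record, n: int) -> None:
        """Write a record at position n; repeating the call changes nothing."""
        if n == len(self.records) + 1:
            self.records.append(record)
        elif 1 <= n <= len(self.records):
            self.records[n - 1] = record
        else:
            raise IndexError("record number out of range")


# Non-idempotent: a retransmitted request duplicates the record.
plain = RecordFile()
plain.append_record("a")
plain.append_record("a")  # the same request, executed again

# Idempotent: the client fixes the target position once, then writes to it.
safe = RecordFile()
n = safe.get_last_record_no() + 1
safe.write_record_n("a", n)
safe.write_record_n("a", n)  # the same request, executed again
```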
Revision Exercise:
3) What is a stub? How are stubs generated? State their functionality and
purpose.