DC (U1)


IV YEAR VIIISEM Distributed Computing

UNIT I
Unit I Introduction: Characteristics, Examples, Applications, Challenges – System models: Architectural
models and Fundamental models – Network principles and Internet protocols – Inter-process communication:
API, Marshalling, Multicast communication, Client-server communication, Group communication.

1.1 INTRODUCTION TO DISTRIBUTED SYSTEMS


Definition
• A distributed system is a collection of independent computers that appears to its users as a single coherent system.
• Important characteristics of a distributed system:
(i) The differences between the various computers and the ways in which they communicate are
hidden from users. The internal organization of the distributed system is also hidden from the
users.
(ii) Users and applications can interact with a distributed system in a consistent and uniform
way, regardless of where and when interaction takes place.
• A distributed system will normally be continuously available, although perhaps certain parts may be temporarily
out of order.
• The motivation for constructing and using distributed system is resource sharing.
• Resources such as printers, files, web pages or database records are managed by servers of appropriate type.
• The purpose of this chapter is to convey a clear view of the nature of distributed systems and the challenges that must
be addressed in order to ensure that they are successful.
• We look at some key examples of distributed systems, the components from which they are
constructed and their purposes.
1.2 Examples of distributed Systems
• Let us now look at several examples of distributed systems. The Internet is one example of a distributed
system.
• The Internet is an interconnected collection of computer networks of many different types. Figure 1.1 illustrates a
typical portion of the Internet.

Figure 1.1 A typical portion of the internet

• It enables the users, wherever they are, to make use of services such as World Wide Web, e-mail and file transfer.

• The services of the Internet can be extended by adding servers and new types of service.
• The above figure shows a collection of intranets. Intranets are subnetworks operated by companies and other
organizations.
• ISPs (Internet Service Providers) are companies that provide connections to individual users and small organizations.
• The intranets are linked together by backbones.
• A backbone is a network link with a high transmission capacity. It employs satellite connections, fibre-optic cables
and other high-bandwidth circuits.
• Multimedia services are available on the Internet. These services enable users to access audio and video data.
• As a second example, consider a work flow information system that supports the automatic processing of
orders.
• Typically, such a system is used by people from several departments, at different locations.
• For example, people from the sales department may be spread across a large region or an entire country.
➢ Orders are placed by means of laptop computers that are connected to the system through the telephone network.
➢ Incoming orders are automatically forwarded to the planning department.
➢ The system automatically forwards each order to an appropriate and available person.
• As a final example, consider the World Wide Web. The Web offers a simple, consistent and uniform model of distributed
documents.
• To see a document, a user need merely activate a reference, and the document appears on the screen. Publishing a
document is also simple: you only have to give it a unique name in the form of a Uniform Resource Locator (URL)
that refers to a local file containing the document's content.
1.3. CHALLENGES
The key challenges faced by the designers of distributed systems are:
➢ Heterogeneity.
➢ Openness.
➢ Security.
➢ Scalability.
➢ Failure handling.
➢ Concurrency.
➢ The need for transparency.
1.3.1 Heterogeneity
• Distributed systems must be constructed from a variety of different networks, operating systems, computer hardware and
programming languages.
• The Internet communication protocols mask the differences between networks, and middleware can deal with the other
differences.
• The term Middleware applies to a software layer that provides a programming abstraction as well as masking the
heterogeneity of the underlying networks, hardware, operating systems and programming languages.
Example: Common Object Request Broker Architecture (CORBA).
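One concrete difference that middleware hides is the byte order in which different hardware stores integers. The sketch below is only an illustration, not part of CORBA: it uses Python's standard struct module to convert data into an agreed external representation, and the message layout and the names marshal_order and unmarshal_order are assumptions made for this example.

```python
import struct

# Hypothetical wire format for illustration: order id (unsigned 32-bit),
# quantity (unsigned 16-bit), unit price (64-bit float).
# "!" selects network (big-endian) byte order with no padding, so the bytes
# are the same whatever the sender's native byte order is.
ORDER_FORMAT = "!IHd"

def marshal_order(order_id, quantity, unit_price):
    """Convert the fields into the agreed external byte representation."""
    return struct.pack(ORDER_FORMAT, order_id, quantity, unit_price)

def unmarshal_order(data):
    """Rebuild the fields from the external representation."""
    return struct.unpack(ORDER_FORMAT, data)

message = marshal_order(1001, 3, 19.5)
print(len(message))              # 14 bytes: 4 + 2 + 8
print(unmarshal_order(message))  # (1001, 3, 19.5)
```

Because both sides agree on the external format, neither needs to know what hardware the other is running on.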
1.3.2 Openness
• The openness of a computer system is the characteristic that determines whether the system can be extended and re-
implemented in various ways.
• The openness of distributed systems is determined primarily by the degree to which new resource sharing services can
be added and be made available for use by a variety of client programs.
• The first step to provide openness is to publish the interfaces of the components, but the integration of components
written by different programmers is a real challenge.
• Open systems are characterized by the fact that their key interfaces are published.

• Open distributed systems are based on the provision of a uniform communication mechanism and published interfaces for
access to shared resources.
• Open systems can be constructed from heterogeneous hardware and software.
1.3.3 Security
Security for information resources has three components:
(i) Confidentiality
Protection against disclosure to unauthorized individuals.
(ii) Integrity
Protection against alteration or corruption.
(iii) Availability
Protection against interference with the means to access the resources.
• The challenge is to send sensitive information in a message over a network in a secure manner.
• But security is not just a matter of concealing the contents of messages, it also involves knowing for sure the identity of
the user.
• The second challenge is to identify a remote user.
• Both of these challenges can be met by the use of encryption techniques.
• Encryption can be used to provide adequate protection of shared resources and to keep sensitive information secret when
it is transmitted in messages over a network.
• Denial of service attacks are still a problem.
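As a small illustration of how cryptographic techniques address integrity and sender identity, the sketch below attaches a message authentication code (HMAC) to a message. This is a swapped-in technique rather than encryption: Python's standard library has no cipher, so confidentiality is not shown. The shared key and the message contents are hypothetical.

```python
import hashlib
import hmac

def sign(key, message):
    """Compute an authentication tag that only holders of the key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, tag):
    """Check the tag in constant time; False means altered message or wrong sender."""
    return hmac.compare_digest(sign(key, message), tag)

shared_key = b"example-shared-secret"      # hypothetical key known to both parties
message = b"transfer 100 to account 42"
tag = sign(shared_key, message)

print(verify(shared_key, message, tag))                        # True
print(verify(shared_key, b"transfer 900 to account 42", tag))  # False: altered
print(verify(b"some-other-key", message, tag))                 # False: wrong key
```

A receiver that accepts only messages with a valid tag detects alteration in transit and knows the sender held the shared key.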
1.3.4 Scalability
• A distributed system is scalable if the cost of adding a user is a constant amount in terms of the resources that must be
added.
• In other words, a system is described as scalable if it will remain effective when there is significant increase in the
number of resources and the number of users.
• The design of scalable distributed systems presents the following challenges:
➢ Controlling the cost of physical resources:
• As the demand for a resource grows, it should be possible to extend the system, at reasonable cost, to meet it.
➢ Controlling the performance loss:
• Consider the management of a set of data whose size is proportional to the number of users or resources in
the system.
• In this case, algorithms that use hierarchic structures scale better than those that use linear structures.
• But even with hierarchic structures, an increase in size will result in some loss of performance.
➢ Size:
• When a system needs to scale, we must consider scaling with respect to size.
• If more users or resources need to be supported, we are confronted with the limitations of centralized services, data and
algorithms.
• For example, many services are centralized in the sense that they are implemented by means of only a single server
running on a specific machine in the distributed system.
• The problem is that the server becomes a bottleneck as the number of users grows.
• Even if it has virtually unlimited processing and storage capacity, communication with that server will eventually prohibit
further growth.
➢ Preventing software resources from running out:
• As an example, the supply of available Internet addresses is running out.
• There is no correct solution to this problem, because it is difficult to predict the demand that will be put on a system years
ahead.
➢ Avoiding performance bottlenecks:
• Algorithms should be decentralized to avoid performance bottlenecks.
• Consider routing: in a large distributed system, an enormous number of messages have to be routed over many lines.
• In theory, the optimal way to do this is to collect complete information about the load on all machines and lines, and then run a
graph theory algorithm to compute all the optimal routes.
• The trouble is that collecting and transporting all of this input and output information would itself be a bad idea,
because these messages would overload part of the network.
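The claim that hierarchic structures scale better than linear ones can be made concrete by counting lookup steps. In the sketch below, binary search over a sorted list stands in for a balanced hierarchic index; the function names and data sizes are assumptions chosen for illustration.

```python
def linear_lookup_steps(sorted_ids, target):
    """Count the entries examined when scanning a flat list from the start."""
    steps = 0
    for value in sorted_ids:
        steps += 1
        if value == target:
            break
    return steps

def hierarchic_lookup_steps(sorted_ids, target):
    """Count the probes made by binary search, a stand-in for a balanced tree index."""
    steps = 0
    low, high = 0, len(sorted_ids) - 1
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_ids[mid] == target:
            break
        if sorted_ids[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps

for n in (1_000, 100_000):
    ids = list(range(n))
    worst_case = n - 1   # the last entry is the worst case for the linear scan
    print(n, linear_lookup_steps(ids, worst_case), hierarchic_lookup_steps(ids, worst_case))
```

The linear cost grows in proportion to the number of entries, whereas the hierarchic cost grows only with its logarithm. This is the sense in which a hierarchic structure still loses some performance as it grows, but far less.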
1.3.5 Failure handling
• Failures fall into two obvious categories: hardware and software.
• If a failure occurs, programs may produce incorrect results or they may stop before they have completed the
intended computation.
• Failures in a distributed system are partial, i.e., some components fail while others continue to function.
Therefore, the handling of failures is particularly difficult.
• The techniques for dealing with failures are listed below.
➢ Detecting failures - Some failures can be detected. E.g. checksums can be used to detect corrupted data in a message
or a file.
➢ Masking failures - Some failures that have been detected can be hidden or made less severe. E.g. messages can be
retransmitted when they fail to arrive.
➢ Tolerating failures - It is not practical to detect and hide all the failures that might occur in a large network. Clients
can be designed to tolerate failures, which generally involves the users tolerating them as well. Services can be
made to tolerate failures by the use of redundant components.
➢ Recovery from failures - Recovery involves the design of software so that the state of permanent data can be recovered
or rolled back after a server has crashed.
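The first two techniques can be sketched together: a checksum detects a corrupted message, and retransmission masks both loss and corruption. The example uses CRC-32 from Python's standard zlib module. ScriptedChannel is a hypothetical stand-in for an unreliable network whose faults follow a fixed script so that the run is repeatable.

```python
import zlib

def make_frame(payload):
    """Attach a CRC-32 checksum so the receiver can detect corruption."""
    return payload, zlib.crc32(payload)

def is_intact(frame):
    payload, checksum = frame
    return zlib.crc32(payload) == checksum

class ScriptedChannel:
    """Hypothetical channel whose faults follow a fixed script, for repeatability."""
    def __init__(self, script):
        self.script = list(script)   # e.g. ["drop", "corrupt", "ok"]

    def transmit(self, frame):
        fault = self.script.pop(0) if self.script else "ok"
        if fault == "drop":
            return None                                   # message never arrives
        if fault == "corrupt":
            payload, checksum = frame
            damaged = bytes([payload[0] ^ 0xFF]) + payload[1:]
            return damaged, checksum                      # payload altered in transit
        return frame

def send_with_retries(payload, channel, max_attempts=5):
    """Mask loss and corruption by retransmitting until an intact copy arrives."""
    frame = make_frame(payload)
    for attempt in range(1, max_attempts + 1):
        received = channel.transmit(frame)
        if received is not None and is_intact(received):
            return received[0], attempt
    raise TimeoutError("no intact copy arrived within the retry limit")

channel = ScriptedChannel(["drop", "corrupt", "ok"])
data, attempts = send_with_retries(b"order 1001", channel)
print(data, attempts)   # b'order 1001' 3
```

If the retry limit is exhausted, the failure is no longer masked: it is reported to the caller, which must then tolerate it or recover from it.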
1.3.6 Concurrency
• Both services and applications provide resources that can be shared by clients in a distributed system.
• There is therefore a possibility that several clients will attempt to access a shared resource at the same time.
• The process that manages a shared resource could take one client request at a time. But that approach limits
throughput.
• Therefore, services and applications generally allow multiple client requests to be processed concurrently.
• For an object to be safe in a concurrent environment, its operations must be synchronized in such a way that its data
remains consistent.
• This can be achieved by standard techniques such as semaphores.
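A minimal sketch of this idea, assuming a hypothetical shared Account object: a binary semaphore from Python's threading module ensures that only one client thread at a time executes the read-modify-write step, so no update is lost.

```python
import threading

class Account:
    """A shared resource whose update operation is guarded by a binary semaphore."""
    def __init__(self, balance=0):
        self.balance = balance
        self._mutex = threading.Semaphore(1)   # at most one thread inside at a time

    def deposit(self, amount):
        with self._mutex:                      # acquire on entry, release on exit
            current = self.balance
            self.balance = current + amount

def client(account, deposits):
    for _ in range(deposits):
        account.deposit(1)

account = Account()
threads = [threading.Thread(target=client, args=(account, 10_000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(account.balance)   # 80000
```

Without the semaphore, two threads could read the same balance and both write back balance + 1, losing one deposit.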
1.3.7 Transparency
• It is defined as the concealment from the user and the application programmer of the separation of components in a
distributed system.
• Hence, the system is perceived as a whole rather than as a collection of independent components.
• The main aim of the transparency is to make certain aspects of distribution invisible to the application programmers.
Access transparency enables local and remote resources to be accessed using identical operations.
Location transparency enables resources to be accessed without knowledge of their location.
Concurrency transparency enables several processes to operate concurrently using shared resources without
interference between them.
Replication transparency enables multiple instances of resources to be used to increase reliability and performance
without knowledge of the replicas by users or application programmers.
Failure transparency enables the concealment of faults, allowing users and application programs to complete their tasks
despite the failure of hardware or software components.

Mobility transparency allows the movement of resources and clients within a system. The movement will not affect
the operation of users or programs.
Performance transparency allows the system to be reconfigured to improve performance as loads vary.
Scaling transparency allows the system and applications to expand in scale without change to the system structure or
the application algorithms.
• The two most important transparencies are access and location transparency; their presence or absence
most strongly affects the utilization of distributed resources. Together they are referred to as network transparency.
1.4 ARCHITECTURAL MODELS
• An architectural model of a distributed system is concerned with placement of its parts and the relationships between
them.
• An architectural model defines the way in which the components of systems interact with one another and the way in
which they are mapped onto an underlying network of computers.
• The overall goal is to ensure that the structure will meet present and likely future demands on it.
• Major concerns are to make the system reliable, manageable, adaptable and cost effective.
• An architectural model first simplifies and abstracts the functions of the individual components of a
distributed system and then it considers:
o The placement of the components across a network of computers, seeking to define useful patterns
for the distribution of data and workload.
• An initial simplification is achieved by classifying processes as server processes, client processes and peer processes.
This classification identifies the responsibilities of each and hence helps us to assess their workloads and to
determine the impact of failures in each of them.
1.4.1 Software layers
• The term 'software architecture' refers to the structuring of software as layers or modules in a single computer and, more generally, in
terms of services offered and requested between processes located in the same or different computers.
• This process and service-oriented view can be expressed in terms of service layers.

Figure 1.3: Software and hardware service layers


➢ Platform
• The lowest level hardware and software layers are known as a platform for distributed systems and applications.
• These low level layers provide services to the layers above them.
• They are implemented independently in each computer.
• They bring the system's programming interface up to a level that facilitates communication and coordination between
processes, e.g. Intel x86/Windows, Intel x86/Linux, Sun SPARC/SunOS, Intel x86/Solaris, etc.
➢ Middleware
• Middleware is defined as a layer of software whose purpose is to mask heterogeneity and to provide a convenient
programming model to application programmers.
• Middleware is concerned with providing useful building blocks for the construction of software components that can
work with one another in a distributed system.
• In particular, it raises the level of the communication activities of application programs through the support of
abstractions such as remote method invocation, communication between a group of
processes, notification of events, replication of shared data and transmission of multimedia data in real time.
1.4.2 System Architectures
In this section, let us discuss the principal architectural models on which distribution of responsibilities is
based. The main types of architectural models are given below.
➢ Client-server model
In the basic client-server model, processes in a distributed system are divided into two groups: (i) servers and (ii) clients.
o A server is a process implementing a specific service, for example a file system service or a
database service.
o A client is a process that requests a service from a server by sending it a request and subsequently waiting for the
server's reply.
Figure 1.4 illustrates the simple structure in which client processes interact with individual server processes in separate
host computers in order to access the shared resources that they manage.

Figure 1.4: Clients invoke individual servers


• Servers may be clients of other servers. For example, a web server is a client of a local file server that manages the
files in which the web pages are stored.
• Web servers and other Internet services are clients of the DNS service, which translates Internet Domain Names to
network addresses.
➢ Services provided by multiple servers
• Services may be implemented as several server processes in separate host computers interacting as necessary to
provide a service to client processes as shown in Figure 1.5.
• The servers may partition the set of objects on which the service is based and distribute them between themselves, or they may
maintain replicated copies of them on several hosts.
• The web provides a common example of partitioned data in which each web server manages its own set of resources.
• Replication is used to increase performance and availability and to improve fault tolerance.
• It provides multiple consistent copies of data in processes running in different computers.

Figure 1.5: A service provided by multiple servers

➢ Proxy servers and caches


• A cache is a store of recently used data objects.
• When a new object is received at a computer it is added to the cache store, replacing some existing objects, if
necessary.
• When an object is needed by a client process, the caching service first checks the cache and supplies the object from
there if an up-to-date copy is available.
• If not, an up-to-date copy is fetched from the server. Caches may be co-located with each client or in a proxy server
that can be shared by several clients.
• Web proxy servers provide a shared cache of web resources for the client machines at a site or across several sites.
The purpose of proxy servers is to increase availability and performance of the service by reducing the load on the
WAN and web servers.
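The caching behaviour described above can be sketched as follows. OriginServer and CachingProxy are hypothetical classes written for this illustration. The sketch assumes that cached copies never go stale, so the up-to-date check is omitted, and it evicts the least recently used object when the cache is full.

```python
from collections import OrderedDict

class OriginServer:
    """Stands in for a web server; counts how many requests actually reach it."""
    def __init__(self, pages):
        self.pages = pages
        self.requests_served = 0

    def fetch(self, url):
        self.requests_served += 1
        return self.pages[url]

class CachingProxy:
    """Shared cache placed between several clients and the origin server."""
    def __init__(self, origin, capacity=2):
        self.origin = origin
        self.capacity = capacity
        self.cache = OrderedDict()             # oldest entry first

    def get(self, url):
        if url in self.cache:                  # cache hit: no traffic to the origin
            self.cache.move_to_end(url)
            return self.cache[url]
        body = self.origin.fetch(url)          # cache miss: fetch and remember
        self.cache[url] = body
        if len(self.cache) > self.capacity:    # replace an existing object if necessary
            self.cache.popitem(last=False)
        return body

origin = OriginServer({"/a": "page A", "/b": "page B"})
proxy = CachingProxy(origin)
for url in ["/a", "/b", "/a", "/a"]:
    proxy.get(url)
print(origin.requests_served)   # 2
```

Four client requests produce only two requests to the origin server, which is how a proxy reduces the load on the WAN and on web servers.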

Figure 1.6: Web proxy Server


➢ Peer processes
• In this architecture, all of the processes play similar roles, interacting cooperatively as peers to perform a distributed
activity without any distinction between clients and servers.
• In this model, code in the peer processes maintains consistency of application-level resources and synchronizes
application-level actions when necessary.
• The elimination of server processes reduces inter-process communication delays for access to local objects.

1.4.3 Variations on the client-server model


Several variations on the client-server model can be derived from the consideration of the following factors:
o The use of mobile code and mobile agents.
o Users' need for low-cost computers with limited hardware resources that are simple to manage.
o The requirement to add and remove mobile devices in a convenient manner.
➢ Mobile code
• Applets are a well-known and widely used example of mobile code.
• When the user running a browser selects a link to an applet whose code is stored on a web server, the code is downloaded
to the browser and runs there.
• An advantage of running the downloaded code locally is that it can give good interactive response, since it does not
suffer from the delays or variability of bandwidth associated with network communication.
➢ Mobile agents
• A mobile agent is a running program, including both code and data, that travels from one computer
to another in a network carrying out a task on someone's behalf, such as collecting information, and
eventually returning with the results.
• Mobile agents might be used to install and maintain software on the computers within an
organization or to compare the prices of products from a number of vendors by visiting the site of each vendor
and performing a series of database operations.
• Mobile agents are a potential security threat to the resources in computers that they visit.
• The environment receiving a mobile agent should decide on which of the local resources it should be allowed to use,
based on the identity of the user on whose behalf the agent is acting.
➢ Thin clients
• The term thin client refers to a software layer that supports a window-based user interface on a computer that is
local to the user while executing application programs on a remote computer.
• Instead of downloading the code of applications into the user's computer, it runs them on a compute server (a
powerful computer that has the capacity to run large numbers of applications simultaneously).
• The main drawback of the thin client architecture is in highly interactive graphical activities such as CAD
and image processing, where the delays experienced by users are increased by the need to transfer image and vector
information between the thin client and the application process, incurring both network and operating system
latencies.
➢ Mobile devices and spontaneous networking
• The world is increasingly populated by small and portable computing devices, including laptops, handheld devices
such as PDAs, mobile phones and digital cameras, wearable computers such as smart watches, and devices embedded
in everyday appliances such as washing machines.
• With appropriate integration into our distributed systems, these devices provide support for mobile computing,
whereby users carry their mobile devices between network environments and take advantage of local and remote
services as they do so.
• The form of distribution that integrates mobile devices and other devices into a given network is best described by the
term spontaneous networking.
The key features of spontaneous networking are:
➢ Easy connection to a local network: Wireless links avoid the need for pre-installed cabling and avoid the
inconvenience and reliability issues surrounding plugs and sockets.
A device brought into a new network environment is transparently reconfigured to obtain connectivity.
➢ Easy integration with local services: Devices that find themselves inserted into existing networks of devices
automatically discover the services that are provided there, with no special configuration actions performed by the user.
➢ Limitations of spontaneous networking
• Internet addressing and routing are difficult to apply, because they assume that a computer stays on one particular subnet.
• Limited connectivity: users are not always connected as they move around (e.g. through tunnels).
• Security and privacy.
➢ Discovery services
• Spontaneous networking allows client processes running on portable devices to access services on the networks to
which they are connected. However, the clients may not know which services are available in the network to which they
are connected.
• The purpose of a discovery system is to accept and store details of services that are available on the network and to
respond to queries from clients about them.
A discovery service offers two interfaces:
➢ A registration service - accepts registration requests from servers and records the details that they contain in the
discovery service's database of currently available services.
➢ A lookup service - accepts queries concerning available services and searches its database for registered services that
match the queries.
o The result returned includes sufficient details to enable clients to select between several similar services based on
their attributes and to make a connection to one or more of them.
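The two interfaces can be sketched as a small in-memory class. DiscoveryService, its method names, the attribute names and the addresses are all hypothetical. A real discovery service would also have to handle services that leave the network without deregistering.

```python
class DiscoveryService:
    """In-memory sketch of the registration and lookup interfaces."""
    def __init__(self):
        self._registered = {}   # service name -> details (address and attributes)

    def register(self, name, address, **attributes):
        """Registration interface: record the details supplied by a server."""
        self._registered[name] = {"address": address, **attributes}

    def lookup(self, **required):
        """Lookup interface: return every service whose attributes match the query."""
        matches = {}
        for name, details in self._registered.items():
            if all(details.get(key) == value for key, value in required.items()):
                matches[name] = details
        return matches

directory = DiscoveryService()
directory.register("printer-lobby", "10.0.0.5:631", kind="printer", colour=True)
directory.register("printer-lab", "10.0.0.9:631", kind="printer", colour=False)
directory.register("projector-1", "10.0.0.12:80", kind="projector")

found = directory.lookup(kind="printer", colour=True)
print(found)
```

The result carries each matching service's address and attributes, which is what lets a client choose between similar services and then connect to one of them.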

➢ Network computers
• The OS and application software for desktop computers typically require much of the active code and data to be
located on a local disk.
• But the management of application files and the maintenance of a local software base require considerable technical
effort of a nature that most users are not qualified to provide.
• The network computer is a response to this problem.
• It downloads its operating system and any application software needed by the user from a remote file server.
• Applications are run locally but the files are managed by a remote file server.
• Since all the application data and code are stored by a file server, users may migrate from one network computer to
another.
1.5 FUNDAMENTAL MODELS
• Fundamental models are concerned with a more formal description of the properties that are common to all of
the architectural models.
• All communication between processes is achieved by means of messages.
• Message communication over a computer network can be affected by delays. It may suffer from a variety of failures.
• It is vulnerable to security attacks. These issues are addressed by three models.
• The interaction model deals with performance and with the difficulty of setting time limits in a distributed system.
• The failure model attempts to give a precise specification of the faults that can be exhibited by processes and
communication channels.
• The security model discusses the possible threats to processes and communication channels. It introduces the concept
of a secure channel, which is secure against these threats.
1.5.1 Interaction model
• Interacting processes perform all activities in a distributed system.
• Each process has its own state, consisting of the set of data that it can access and update, including the variables in its
program.
• The state belonging to each process is completely private, i.e., it cannot be accessed or updated by any other process.
• Two significant factors affecting interacting processes in a distributed system are:
o Communication performance is often a limiting characteristic.
o It is impossible to maintain a single global notion of time.
➢ Performance of communication channels
Communication over a computer network has performance characteristics relating to latency, bandwidth
and jitter.
• The delay between the start of a message's transmission from one process and the beginning of its receipt by another is
referred to as latency. The latency includes:
o The delay in accessing the network.
o The time taken by the OS communication services, which may vary according to the load
on the OS.
• Bandwidth is the total amount of information that can be transmitted in a given time. When a large number of
communication channels are using the same network, they have to share the available bandwidth.
• Jitter is the variation in the time taken to deliver a series of messages. Jitter is relevant to multimedia data. E.g., if
consecutive samples of audio data are played with different time intervals then the sound will be badly distorted.
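The three characteristics can be related with a little arithmetic. The sketch below assumes the common approximation that transmission time is latency plus message size divided by bandwidth, and it measures jitter simply as the spread between the slowest and fastest delivery. All figures are made-up examples.

```python
def transfer_time(latency_s, size_bits, bandwidth_bps):
    """Approximate time to deliver one message: fixed latency plus size / bandwidth."""
    return latency_s + size_bits / bandwidth_bps

def peak_to_peak_jitter(delivery_times_s):
    """One simple measure of jitter: spread between slowest and fastest delivery."""
    return max(delivery_times_s) - min(delivery_times_s)

# A 1 MB (8,000,000-bit) message over a 100 Mbit/s link with 5 ms latency.
alone = transfer_time(latency_s=0.005, size_bits=8_000_000, bandwidth_bps=100_000_000)
print(round(alone, 3))    # 0.085

# The same link shared equally by four channels: each gets a quarter of the bandwidth.
shared = transfer_time(latency_s=0.005, size_bits=8_000_000, bandwidth_bps=100_000_000 / 4)
print(round(shared, 3))   # 0.325

# Delivery times of four consecutive audio packets, in seconds.
print(round(peak_to_peak_jitter([0.020, 0.022, 0.035, 0.021]), 3))   # 0.015
```

For small messages the latency term dominates, whereas for large transfers the bandwidth term does.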
➢ Computer clocks and timing events
• Each computer in a distributed system has its own internal clock, which can be used by local processes to
obtain the value of the current time.
• Therefore, two processes running on different computers can associate timestamps with their events.
• However, even if two processes read their clocks at the same time, their local clocks may supply different time values.
• This is because computer clocks drift from perfect time and their drift rates differ from one another.
• The term clock drift rate refers to the rate at which a computer clock deviates from a perfect reference clock.
• Even if the clocks on all computers are set to the same time initially, their clocks would eventually vary quite
significantly unless corrections are applied.
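A short calculation shows how quickly uncorrected clocks diverge. The drift rates below (20 and 30 parts per million) are assumed values chosen only for illustration, and clock_reading is a hypothetical helper.

```python
def clock_reading(real_elapsed_s, drift_rate):
    """Reading of a clock started at 0 that gains drift_rate seconds per real second."""
    return real_elapsed_s * (1 + drift_rate)

ONE_DAY_S = 24 * 60 * 60

fast_clock = clock_reading(ONE_DAY_S, 20e-6)    # runs 20 ppm fast
slow_clock = clock_reading(ONE_DAY_S, -30e-6)   # runs 30 ppm slow

# Both clocks were set identically a day ago; this is how far apart they are now.
print(round(fast_clock - slow_clock, 2))   # 4.32
```

A disagreement of several seconds per day is far larger than typical message delays, so timestamps taken on different computers cannot be compared directly without clock correction.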
➢ Two variants of the interaction model
In a distributed system it is hard to set time limits for process execution, message delivery or clock drift.
Two opposing extreme positions provide a pair of simple models: the first makes strong assumptions about time and the
second makes no assumptions about time.
(i) Synchronous distributed systems
A synchronous distributed system is one in which the following bounds are specified:
➢ The time to execute each step of a process has known lower and upper bounds.
➢ Each message transmitted over a channel is received within a known bounded time.
➢ Each process has a local clock whose drift rate from real time has a known bound.
• It is possible to suggest likely upper and lower bounds for process execution time, message delivery time and clock drift
rates in a distributed system.
• But it is difficult to arrive at realistic values and to provide guarantees of the chosen values.
• Unless the values are guaranteed, any design based on the chosen values will not be reliable.
(ii) Asynchronous distributed systems
• An asynchronous distributed system is one in which there are no bounds on process execution speeds,
message transmission delays and clock drift rates.
• This exactly models the Internet, in which there is no intrinsic bound on server or network load and therefore on how
long it takes, for example, to transfer a file using ftp.
• Actual distributed systems are very often asynchronous because of the need for
processes to share the processors and for communication channels to share the network.
1.5.2 The failure model
The failure model defines the ways in which failure may occur in order to provide an understanding of the effects of
failures.
There are three types of failure, explained below:
➢ Omission failures: The faults classified as omission failures refer to cases in which a process or communication
channel fails to perform actions that it is supposed to perform.
There are two types of omission failure:
➢ Process omission failures:
• The chief omission failure of a process is to crash.
• The design of services that can survive in the presence of faults can be simplified if it can be assumed that the
services on which they depend crash cleanly, i.e. the processes either function correctly or else stop.
• Other processes may be able to detect such a crash by the fact that the process repeatedly fails to respond to
invocation messages.
• However, this method of crash detection relies on the use of timeouts.
• A process crash is called fail-stop if other processes can detect with certainty that the process has crashed.
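Timeout-based crash detection can be sketched as below. TimeoutFailureDetector is a hypothetical class, and times are passed in explicitly so that the example does not depend on a real clock. In an asynchronous system a timeout only justifies suspicion, because a slow process looks the same as a crashed one, so the method is named suspects rather than has_crashed.

```python
class TimeoutFailureDetector:
    """Suspects a process if it has not replied within timeout_s seconds."""
    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self._last_reply_s = {}   # process id -> time of most recent reply

    def record_reply(self, process_id, now_s):
        self._last_reply_s[process_id] = now_s

    def suspects(self, process_id, now_s):
        last = self._last_reply_s.get(process_id)
        if last is None:          # never heard from this process at all
            return True
        return now_s - last > self.timeout_s

detector = TimeoutFailureDetector(timeout_s=3.0)
detector.record_reply("p", now_s=0.0)
detector.record_reply("q", now_s=0.0)
detector.record_reply("q", now_s=4.0)       # q keeps responding, p falls silent

print(detector.suspects("p", now_s=5.0))    # True
print(detector.suspects("q", now_s=5.0))    # False
```

Only in a synchronous system, where reply times are bounded, can a suitably chosen timeout turn this suspicion into certainty.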
➢ Communication omission failures
• As an example, consider the two communication primitives send and receive. A process p performs a
send by inserting the message m in its outgoing message buffer.
• The communication channel transports m to q's incoming message buffer.
• Process q performs a receive by taking m from its incoming message buffer and delivering it.

• The outgoing and incoming message buffers are typically provided by the operating system.
• The communication channel produces an omission failure if it does not transport a message from p's outgoing
message buffer to q's incoming message buffer. This is known as 'dropping messages' and is generally caused by
lack of buffer space.
• The loss of a message between the sending process and the outgoing message buffer is called a send-omission
failure.
• The loss of a message between the incoming message buffer and the receiving process is called a receive-omission
failure, and the loss of messages in between is called a channel omission failure.
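The buffers and the channel can be modelled in a few lines. The sketch below assumes a hypothetical Channel class in which q's incoming buffer has a fixed capacity, so that a message arriving at a full buffer is dropped, which is a channel omission failure.

```python
from collections import deque

class Channel:
    """Models p's outgoing buffer, the channel, and q's bounded incoming buffer."""
    def __init__(self, incoming_capacity):
        self.outgoing = deque()    # p's outgoing message buffer
        self.incoming = deque()    # q's incoming message buffer
        self.incoming_capacity = incoming_capacity
        self.dropped = []          # messages lost in the channel

    def send(self, message):
        """p performs a send: the message is placed in the outgoing buffer."""
        self.outgoing.append(message)

    def transport(self):
        """The channel moves messages across, dropping any that do not fit."""
        while self.outgoing:
            message = self.outgoing.popleft()
            if len(self.incoming) < self.incoming_capacity:
                self.incoming.append(message)
            else:
                self.dropped.append(message)   # channel omission failure

    def receive(self):
        """q performs a receive: take the next message, or None if there is none."""
        return self.incoming.popleft() if self.incoming else None

channel = Channel(incoming_capacity=2)
for m in ["m1", "m2", "m3"]:
    channel.send(m)
channel.transport()

print(channel.receive(), channel.receive(), channel.receive())   # m1 m2 None
print(channel.dropped)                                            # ['m3']
```

From q's point of view the dropped message simply never arrives. The sender cannot tell this apart from a slow channel without a timeout.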
➢ Arbitrary failures
• The term arbitrary or Byzantine failure is used to describe the worst possible failure semantics, in which any
type of error may occur.
• An arbitrary failure of a process is one in which it arbitrarily omits intended processing steps or takes unintended
processing steps.
• Communication channels can also suffer from arbitrary failures, e.g. message contents may be corrupted,
non-existent messages may be delivered or real messages may be delivered more than once.
➢ Timing failures
• Timing failures are applicable in synchronous distributed systems, where time limits are set for all operations.
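Class of failure   Affects   Description
Clock              Process   Process's local clock exceeds the bounds on its rate of drift from real time
Performance        Process   Process exceeds the bounds on the interval between two steps
Performance        Channel   A message's transmission takes longer than the stated bound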

• Timing is particularly relevant to multimedia computers with video and audio channels. Video information may
require a large amount of data to be transferred.
• Delivering such information without timing failures makes demands on the operating system as well as the
communication system.
➢ Masking failures
• Each component in a distributed system is generally constructed from a collection of other components.
• It is possible to construct reliable services from components that exhibit failures.
For example, multiple servers that hold replicas of data can continue to provide a service when one of
them crashes.
• Knowledge of the failure characteristics of a component can enable a new service to be designed to mask the
failure of the components on which it depends.
• A service masks a failure either by hiding it altogether or by converting it into a more acceptable type
of failure.
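
Converting a failure into a more acceptable type can be sketched in Java. The listing below is an illustration only and is not part of the original notes: the class ChecksumMasking and its methods are hypothetical names. It attaches a CRC-32 checksum to each message so that a corrupted message (an arbitrary failure) is detected and discarded, which the receiver then experiences as an omission failure.

```java
import java.nio.ByteBuffer;
import java.util.Optional;
import java.util.zip.CRC32;

// Hypothetical helper: masks message corruption by turning it into message loss.
class ChecksumMasking {
    // Prefixes the payload with its CRC-32 value, stored in 8 bytes.
    static byte[] wrap(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        return ByteBuffer.allocate(8 + payload.length)
                .putLong(crc.getValue())
                .put(payload)
                .array();
    }

    // Returns the payload if the checksum matches; otherwise the message is dropped.
    static Optional<byte[]> unwrap(byte[] message) {
        if (message.length < 8) {
            return Optional.empty();
        }
        ByteBuffer buf = ByteBuffer.wrap(message);
        long expected = buf.getLong();
        byte[] payload = new byte[buf.remaining()];
        buf.get(payload);
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue() == expected ? Optional.of(payload) : Optional.empty();
    }
}
```

A higher layer can then treat the dropped message like any other omission failure, for example by retransmitting after a timeout.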
1.5.3 Security Model
• The security of a distributed system can be achieved by securing the processes and the channels used for their
interactions and by protecting the objects that they encapsulate against unauthorized access.
 Protecting objects
• Protection is described in terms of objects.
• Objects are intended to be used in different ways by different users. For example, some objects may hold a user's
private data and other objects may hold shared data such as Web pages.
• To support this, access rights specify who is allowed to perform the operations (read/write) of an object.
• The server is responsible for verifying the identity of the client behind each invocation and checking their
access rights on the requested object.
• The clients in turn can check the authenticity of the server.

Fig: Objects and principals

 Securing processes and their interactions


• Distributed systems are deployed and used in tasks that are likely to be subject to external attacks by hostile
users.
• This is especially true for applications such as financial transactions, for which secrecy or integrity is
crucial. Integrity is threatened by security violations as well as by communication failures.
• In order to identify and defeat threats, we have to analyze them.
• A model for the analysis of security threats is explored in the following paragraphs.
• Processes interact by sending messages.
• The messages are exposed to attack because the network and the communication service are open.
• Servers and peer processes expose their interfaces, enabling invocations to be sent to them by any other
process.
• To secure the messages passed over the network, cryptographic algorithms (encryption) are applied to the
messages to provide confidentiality, authentication and integrity.
• The threats from a potential enemy are discussed under the following headings:
 Threats to processes
• A process that is designed to handle incoming requests may receive a message from any other process in the
distributed system, and it cannot necessarily determine the identity of the sender.
• This lack of reliable knowledge of the source of a message is a threat to the correct functioning of both servers
and clients. Spoofing, in which a message is sent with a forged source address, is an example.
 Threats to communication channels
• An enemy can copy, alter or inject messages as they travel across the network and its intervening gateways.
• Such attacks present a threat to the privacy and integrity of information as it travels over the network and to the
integrity of the system.
• All these threats can be defeated by the use of secure channels, which are described below.
 Secure Channels
• A secure channel ensures the privacy and integrity of the data transmitted across it.
• Encryption and authentication are used to build secure channels as a service layer on top of existing
communication services.
• A secure channel is a communication channel connecting a pair of processes, with the following
properties:
• Each of the processes knows reliably the identity of the principal on whose behalf the other process is executing.
• It ensures the privacy and integrity of the data transmitted across it.
• Each message includes a physical or logical time stamp to prevent messages from being replayed or
reordered.
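
Two of these properties, integrity and protection against replay, can be sketched as follows. This is a minimal illustration under stated assumptions, not a complete secure channel: the class AuthenticatedChannel is a hypothetical name, the two ends are assumed to share a secret key already, the logical time stamp is a simple sequence number, and no encryption is applied, so privacy is not provided.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: integrity and replay protection only (no encryption).
class AuthenticatedChannel {
    private static final int TAG_LEN = 32;   // size of an HMAC-SHA256 tag in bytes
    private final SecretKeySpec key;
    private long nextToSend = 0;             // logical time stamp for outgoing messages
    private long lastAccepted = -1;          // highest time stamp accepted so far

    AuthenticatedChannel(byte[] sharedSecret) {
        this.key = new SecretKeySpec(sharedSecret, "HmacSHA256");
    }

    private byte[] tag(byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Message layout: 8-byte sequence number, payload, 32-byte tag over both.
    byte[] seal(byte[] payload) {
        byte[] body = ByteBuffer.allocate(8 + payload.length)
                .putLong(nextToSend++).put(payload).array();
        return ByteBuffer.allocate(body.length + TAG_LEN).put(body).put(tag(body)).array();
    }

    // Returns the payload, or null if the message was altered, replayed or reordered.
    byte[] open(byte[] message) {
        if (message.length < 8 + TAG_LEN) {
            return null;
        }
        byte[] body = Arrays.copyOfRange(message, 0, message.length - TAG_LEN);
        byte[] received = Arrays.copyOfRange(message, body.length, message.length);
        if (!MessageDigest.isEqual(tag(body), received)) {
            return null;                      // integrity check failed
        }
        long seq = ByteBuffer.wrap(body).getLong();
        if (seq <= lastAccepted) {
            return null;                      // replayed or reordered message
        }
        lastAccepted = seq;
        return Arrays.copyOfRange(body, 8, body.length);
    }
}
```

An enemy who copies a sealed message and injects it again is rejected by the sequence-number check, and one who alters it is rejected by the tag check.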
 Denial of service
This is a form of attack in which the enemy interferes with the activities of authorized users by
making excessive invocations on services or message transmission in a network, resulting in
overloading of physical resources.

 Mobile code
• Mobile code raises new and interesting security problems for any process that receives and executes
program code from elsewhere, such as an e-mail attachment. Such code may easily play a Trojan
horse role.
Drawbacks of security techniques
• The use of security techniques such as encryption and access control incurs substantial
processing overhead and management costs.
1.6 INTER PROCESS COMMUNICATION
1.6.1 Introduction
• Inter process communication is concerned with the communication between processes in a distributed
system, both in its own right and as support for communication between distributed objects.
• The Java API for inter process communication in the internet provides both datagram and stream
communication.
• The Application Program Interface (API) to UDP provides a message passing abstraction - the simplest form
of interprocess communication.
• This enables a sending process to transmit a single message to a receiving process. The independent packets
containing these messages are called datagrams.
• In the Java and UNIX APIs, the sender specifies the destination using a socket - an indirect reference to a
particular port used by the destination process at a destination computer.
• The API to TCP provides the abstraction of a two-way stream between pairs of processes. The information
communicated consists of a stream of data items with no message boundaries.
• Request-reply protocols are designed to support client-server communication in the form of either Remote
Procedure Call (RPC) or Remote Method Invocation (RMI).
• Group multicast protocols are designed to support group communication. Group multicast is a form of
communication in which one process in a group of processes transmits the message to all members of the
group.
1.6.2 The API for the Internet Protocols
 Characteristics of Interprocess Communication
• Message passing between a pair of processes can be supported by two message communication operations:
send and receive.
• In order for one process to communicate with another, one process sends a message to a destination and another
process at the destination receives the message.
• This activity involves communication of data from the sending process to the receiving process and may
involve synchronization of the two processes.
• A queue is associated with each message destination.
• Sending processes cause messages to be added to remote queues and receiving processes remove messages
from local queues. Communication between sending and receiving processes may be either synchronous or
asynchronous.
• In the synchronous form of communication, the sending and receiving processes synchronize at every message.
• In this case, both send and receive are blocking operations.
• Whenever a send is issued, the sending process is blocked until the corresponding receive is issued.
• Whenever receive is issued, the process blocks until a message arrives.
• In the asynchronous form of communication, the send operation is non-blocking.
• The sending process is allowed to proceed as soon as the message has been copied to a local buffer, and the
transmission of the message proceeds in parallel with the sending process.
• The receive operation can have blocking and non-blocking variants.
• Messages are sent to (Internet address, local port) pairs. A local port is a message destination within a
computer, specified as an integer.
• A port has exactly one receiver but can have many senders.
• Processes may use multiple ports from which to receive messages.
• Servers generally publicize their port numbers for use by clients.
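
The difference between these forms of send and receive can be modelled inside a single Java program. The sketch below is an analogy only: the class PortModel is a hypothetical name, and a java.util.concurrent queue stands in for the queue associated with a message destination. No network is involved.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Hypothetical model of a message destination (port) inside one JVM.
class PortModel {
    // Asynchronous send: the message is copied to the queue and the sender continues at once.
    static void asyncSend(BlockingQueue<String> port, String message) {
        port.add(message);
    }

    // Synchronous send: a SynchronousQueue has no capacity, so put blocks
    // until a receiver takes the message.
    static void syncSend(SynchronousQueue<String> port, String message) {
        try {
            port.put(message);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Blocking receive: waits until a message is available.
    static String receive(BlockingQueue<String> port) {
        try {
            return port.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Non-blocking receive: returns null immediately if no message is queued.
    static String poll(BlockingQueue<String> port) {
        return port.poll();
    }
}
```

With a LinkedBlockingQueue as the port, asyncSend returns immediately and messages are received in the order they were queued; with a SynchronousQueue, the sender and receiver meet at every message.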

 Socket
• Both forms of communication (UDP and TCP) use the socket abstraction, which provides an endpoint for
communication between processes.

• Inter-process communication consists of transmitting a message between a socket in one process and a socket
in another process.

• For a process to receive messages, its socket must be bound to a local port and to one of the Internet addresses
of the computer on which it runs.

• Processes may use the same socket for sending and receiving messages.

• Any process may make use of multiple ports to receive messages, but a process cannot share ports with other
processes on the same computer.

• Processes using IP multicast are an exception in that they do share ports.

 UDP datagram communication


• A datagram sent by UDP is transmitted from a sending process to a receiving process without
acknowledgement or retries.

• If a failure occurs, the message may not arrive.

• To send or receive messages, a process must first create a socket bound to an Internet address of the local host
and a local port.

• A server will bind its socket to a server port-one that it makes known to clients so that they can send messages
to it.

• A client binds its socket to any free local port.

• A few issues relating to datagram communication are:

 Message Size
• The receiving process needs to specify an array of bytes of a particular size in which to receive a message.

• If the message is too big for the array, it is truncated on arrival.

• Any application requiring messages larger than the maximum must fragment them into chunks of that size.
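
Truncation on arrival can be observed with two sockets on the same computer. The listing below is a minimal sketch, assuming that loopback UDP traffic is permitted on the machine; the class name TruncationDemo and its method are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

// Hypothetical demo: a datagram received into a buffer that is too small is cut short.
class TruncationDemo {
    // Sends message to a socket on this machine and receives it into bufferSize bytes.
    static String sendAndReceive(String message, int bufferSize) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);   // any free port
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000);          // do not wait for ever if the datagram is lost
            byte[] m = message.getBytes();
            sender.send(new DatagramPacket(m, m.length, loopback, receiver.getLocalPort()));
            DatagramPacket in = new DatagramPacket(new byte[bufferSize], bufferSize);
            receiver.receive(in);
            // getLength reports how many bytes were actually placed in the buffer.
            return new String(in.getData(), 0, in.getLength());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The receiver gets no indication of how much data was discarded, which is why the sender and receiver need to agree on a maximum message size in advance.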

 Blocking
• Sockets normally provide non-blocking sends and blocking receives for datagram communication.
 Timeouts
• The receive that blocks for ever is suitable for use by a server that is waiting to receive requests from its
clients.

• But in some programs, it is not appropriate that a process that has used a receive operation should wait
indefinitely in situations where the potential sending process has crashed or the expected message has been
lost. Timeouts can be set on sockets to allow for such requirements.
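
A receive with a timeout might look like the following sketch. The class name ReceiveTimeoutDemo is hypothetical; the socket is bound to a free local port to which nothing is sent, so the receive is expected to give up after the timeout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketTimeoutException;

// Hypothetical demo: bounding the time a process waits in receive.
class ReceiveTimeoutDemo {
    // Returns true if receive gave up after timeoutMillis without a datagram arriving.
    static boolean timesOut(int timeoutMillis) {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMillis);
            DatagramPacket reply = new DatagramPacket(new byte[1000], 1000);
            socket.receive(reply);               // blocks for at most timeoutMillis
            return false;                        // a datagram did arrive
        } catch (SocketTimeoutException e) {
            return true;                         // the sender may have crashed or the message was lost
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The choice of timeout is a judgment call: it should be fairly large in comparison with the time required to transmit a message.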

 Receive from any


• The receive method does not specify an origin for messages.
• Instead an invocation of receive gets a message addressed to its socket from any origin.
• UDP datagrams suffer from the following failures:
• Omission failures: messages may be dropped occasionally.
• Ordering: messages can sometimes be delivered out of sender order.

Use of UDP
• The Domain Name Service (DNS), which looks up DNS names in the Internet, is implemented over UDP.

• UDP datagrams are sometimes an attractive choice because they do not suffer from overheads associated with
guaranteed message delivery.

JAVA API for UDP datagram


The JAVA API provides datagram communication by means of two classes:
 DatagramPacket
 DatagramSocket

DatagramPacket
• This class provides a constructor that makes an instance out of an array of bytes comprising a message, the
length of the message, and the Internet address and local port number of the destination socket.
• This class provides another constructor for use when receiving a message.
• Its arguments specify an array of bytes in which to receive the message and the length of the array.

Fig 1.9 Datagram Packet

DatagramSocket
• This class supports sockets for sending and receiving UDP datagrams.
• It provides a constructor that takes a port number as its argument, for use by processes that need to use a
particular port.
• It also provides a no-argument constructor that allows the system to choose a free local port.
• The class DatagramSocket provides the following methods:
send and receive
These methods are for transmitting datagrams between a pair of sockets.
setSoTimeout
This method allows a timeout to be set. With a timeout set, the receive method will block for the time specified
and then throw an InterruptedIOException (in current Java versions, its subclass SocketTimeoutException).
connect
This method is used for connecting the socket to a particular remote port and Internet address, in which case the
socket is only able to send messages to and receive messages from that address.

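Program: UDPclient

import java.net.*;
import java.io.*;

public class UDPclient {
    public static void main(String args[]) {
        // args[0] is the message, args[1] is the DNS name of the server
        DatagramSocket aSocket = null;
        try {
            aSocket = new DatagramSocket();              // bound to any free local port
            byte[] m = args[0].getBytes();
            InetAddress aHost = InetAddress.getByName(args[1]);
            int serverport = 6789;
            DatagramPacket request = new DatagramPacket(m, m.length, aHost, serverport);
            aSocket.send(request);
            byte[] buffer = new byte[1000];
            DatagramPacket reply = new DatagramPacket(buffer, buffer.length);
            aSocket.receive(reply);                      // blocks until a datagram arrives
            System.out.println("Reply: " + new String(reply.getData(), 0, reply.getLength()));
        } catch (SocketException e) {
            System.out.println("Socket: " + e.getMessage());
        } catch (IOException e) {
            System.out.println("IO: " + e.getMessage());
        } finally {
            if (aSocket != null) aSocket.close();
        }
    }
}
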

 TCP stream Communication


The API to the TCP protocol provides the abstraction of a stream of bytes to which data may be written and
from which data may be read.
The following characteristics of the network are hidden by the stream abstraction.
Message size
The application can choose how much data it writes to a stream or reads from it.
Lost messages
• TCP uses an acknowledgement scheme.
• If the sender does not receive an acknowledgement within a timeout, it retransmits the message.
Flow control
TCP attempts to match the speeds of the processes that read from and write to a stream.

Message ordering and duplication

Message identifiers are associated with each IP packet, which enables the recipient to detect and reject
duplicates, or to reorder messages that do not arrive in sender order.

Message Destination
• A pair of communicating processes establishes a connection before they can communicate over a stream.
• Once a connection is established, the processes simply read from and write to the stream without needing to
use Internet addresses and ports.
• The API for stream communication assumes that when a pair of processes are establishing a connection, one
of them plays the client role and the other plays the server role, but thereafter they could be peers.
• The client role involves creating a stream socket bound to any port and then making a connect request asking
for a connection to a server at its server port.
• The server role involves creating a listening socket bound to a server port and waiting for clients to request
connections. The listening socket maintains a queue of incoming connection requests.
• When the server accepts a connection, a new stream socket is created for the server to communicate with the
client, meanwhile retaining its socket at the server port for listening for connect requests from other clients.
Some outstanding issues related to stream communication are:

 Matching of data items


• Two communicating processes need to agree as to the contents of the data transmitted over a stream.
• E.g., if one process writes an 'int' followed by a 'double', then the reader at the other end must read an
'int' followed by a 'double'.
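
The need for agreement can be seen without any network by writing to and reading from a byte array. The sketch below is illustrative; the class name MatchingDemo is hypothetical. The same 12 bytes yield the original values only when they are read in the order and with the types used by the writer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo: reader and writer must agree on the sequence of data items.
class MatchingDemo {
    // Writer side: an int (4 bytes) followed by a double (8 bytes).
    static byte[] write(int i, double d) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(i);
            out.writeDouble(d);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Matching reader: reads an int, then a double.
    static String readMatching(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int i = in.readInt();
            double d = in.readDouble();
            return i + " " + d;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Mismatched reader: interprets the same bytes as a double, then an int.
    static String readMismatched(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            double d = in.readDouble();
            int i = in.readInt();
            return i + " " + d;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The mismatched reader does not fail with an error; it silently produces meaningless values, which is what makes this kind of disagreement hard to detect.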

 Blocking
• When a process attempts to read data from an input channel, it will get data from the queue or it will block until
data becomes available.
• The process that writes data to a stream may be blocked by the TCP flow control mechanism if the socket at the
other end is queuing as much data as the protocol allows.

 Threads
• When a server accepts a connection,it generally creates a new thread to communicate with the
new client.
 Failure Model
• To satisfy the integrity property, TCP uses checksums to detect and reject corrupt packets and
sequence numbers to detect and reject duplicate packets.

• In order to satisfy the validity property, timeouts and retransmissions are used by TCP.
 Uses of TCP
• Many frequently used services run over TCP connections, with reserved port numbers.
• These include HTTP, FTP, Telnet and SMTP.

 JAVA API for TCP stream


The JAVA interface to TCP streams is provided in the classes ServerSocket and Socket.

 Server Socket
• The ServerSocket class is intended for use by a server to create a socket at a server port for listening for
connect requests from clients.
• Its accept method gets a connect request from the queue or, if the queue is empty, blocks until one arrives.

 Socket
• This class is for use by a pair of processes with a connection. The client uses a constructor to create a socket,
specifying the DNS hostname and port of a server.
• The Socket class provides the methods getInputStream and getOutputStream for accessing the two streams
associated with a socket.

TCP client program

import java.net.*;
import java.io.*;

public class TCPClient {
    public static void main(String args[]) {
        // args[0] is the message, args[1] is the DNS name of the server
        Socket c = null;
        try {
            int serverport = 7896;
            c = new Socket(args[1], serverport);
            DataInputStream in = new DataInputStream(c.getInputStream());
            DataOutputStream out = new DataOutputStream(c.getOutputStream());
            out.writeUTF(args[0]);                       // send the message as a UTF string
            String data = in.readUTF();                  // blocks until the reply arrives
            System.out.println("Received: " + data);
        } catch (UnknownHostException e) {
            System.out.println("Sock: " + e.getMessage());
        } catch (EOFException e) {
            System.out.println("EOF: " + e.getMessage());
        } catch (IOException e) {
            System.out.println("IO: " + e.getMessage());
        } finally {
            if (c != null) {
                try { c.close(); } catch (IOException e) { /* ignore failure on close */ }
            }
        }
    }
}


• In the client program, the arguments of the main method supply a message and the DNS name of the
server.
• The client creates a socket bound to the hostname and server port 7896.
• It makes a DataInputStream and a DataOutputStream from the socket's input and output streams, then writes the
message to its output stream and waits to read a reply from its input stream. UTF-8 is an encoding that represents
strings in a particular format.
• The server program opens a server socket on its server port (7896) and listens for connect requests.
• When one arrives, it makes a new thread in which to communicate with the client.

 Server program
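
import java.net.*;
import java.io.*;

public class TCPServer {
    public static void main(String args[]) {
        ServerSocket lissoc = null;
        try {
            int serverport = 7896;
            lissoc = new ServerSocket(serverport);
            while (true) {
                Socket s = lissoc.accept();              // blocks until a client connects
                // The notes describe one thread per client; for brevity this listing
                // serves each client in the accepting thread.
                DataInputStream in = new DataInputStream(s.getInputStream());
                DataOutputStream out = new DataOutputStream(s.getOutputStream());
                String line = in.readUTF();
                out.writeUTF(line);                      // echo the request back to the client
                s.close();
            }
        } catch (EOFException e) {
            System.out.println("EOF: " + e.getMessage());
        } catch (IOException e) {
            System.out.println("IO: " + e.getMessage());
        } finally {
            if (lissoc != null) {
                try { lissoc.close(); } catch (IOException e) { /* ignore failure on close */ }
            }
        }
    }
}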
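
The thread-per-connection arrangement described above can be sketched as follows. This is an illustrative variant rather than the original listing: the class ThreadedEchoServer and its methods are hypothetical names, and the server listens on a free port chosen by the system (instead of 7896) so that the demo can run on one machine without clashing with other programs.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch: an echo server that creates one thread per accepted connection.
class ThreadedEchoServer {
    // Serves one connection: reads a UTF string and echoes it back.
    static void handle(Socket client) {
        try (Socket s = client;
             DataInputStream in = new DataInputStream(s.getInputStream());
             DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
            out.writeUTF(in.readUTF());
        } catch (IOException e) {
            System.out.println("Connection: " + e.getMessage());
        }
    }

    // Accept loop: the listening socket is retained while each client gets its own thread.
    static void serve(ServerSocket listener) {
        try {
            while (true) {
                Socket client = listener.accept();
                new Thread(() -> handle(client)).start();
            }
        } catch (IOException e) {
            // accept fails once the listening socket is closed; treated here as shutdown
        }
    }

    // Starts a server on a free port, sends one message and returns the echoed reply.
    static String demo(String message) {
        try (ServerSocket listener = new ServerSocket(0)) {
            Thread acceptor = new Thread(() -> serve(listener));
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket c = new Socket(InetAddress.getLoopbackAddress(), listener.getLocalPort());
                 DataOutputStream out = new DataOutputStream(c.getOutputStream());
                 DataInputStream in = new DataInputStream(c.getInputStream())) {
                out.writeUTF(message);
                return in.readUTF();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because each connection has its own thread, a client that is slow to send its request blocks only its own thread and not the accept loop.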

1.7 EXTERNAL DATA REPRESENTATION AND MARSHALLING


• The information stored in running programs is represented as data structures, whereas the information in
messages consists of sequences of bytes.
• Irrespective of the form of communication used, the data structures must be flattened (converted to a sequence
of bytes) before transmission and rebuilt on arrival.
• The representation of data items differs between architectures; for example, integers may be stored in
big-endian or little-endian byte order.
• Another issue is the set of codes used to represent characters. For example, UNIX systems use ASCII character
coding, taking one byte per character, whereas the Unicode standard allows for the representation of text in
many languages and takes two bytes per character.
• One of the following methods can be used to enable any two computers to exchange data values:
• The values are converted to an agreed external format before transmission and converted to the local form on
receipt.
• The values are transmitted in the sender's format, together with an indication of the format used, and the
recipient converts the values if necessary.
• To support RMI (Remote Method Invocation) or RPC (Remote Procedure Call), any data type that can be passed
as an argument or returned as a result must be able to be flattened, and the individual primitive data values
represented in an agreed format.
• An agreed standard for the representation of data structures and primitive values is called an external data
representation.
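
The effect of byte order on a single integer can be shown with java.nio.ByteBuffer. The sketch below is illustrative; the class name ByteOrderDemo is hypothetical. It flattens an int in a chosen byte order and rebuilds it in a possibly different one, which is what happens when two computers exchange values without an agreed external format.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo: the same four bytes mean different integers in different byte orders.
class ByteOrderDemo {
    // Flattens an int into four bytes using the given byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Rebuilds an int from four bytes, interpreting them in the given byte order.
    static int decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }
}
```

For example, the value 1 is flattened as the bytes 0, 0, 0, 1 in big-endian order and 1, 0, 0, 0 in little-endian order; decoding the little-endian bytes as big-endian yields 16777216 instead of 1.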

1.7.1 Marshalling and Unmarshalling


• Marshalling is the process of taking a collection of data items and assembling them into a form suitable for
transmission in a message.
• Unmarshalling is the process of disassembling them on arrival to produce an equivalent collection of data items
at the destination.
• Thus marshalling consists of the translation of structured data items and primitive values into an external data
representation. Similarly, unmarshalling consists of the generation of primitive values from their external
data representation and the rebuilding of the data structures.
Two alternative approaches to external data representation and marshalling are:
• CORBA's common data representation
• Java's object serialization

1.7.1.1 CORBA’s common data representation


• CORBA CDR is the external data representation defined with CORBA 2.0.
• CDR can represent all of the types that can be used as arguments and return values in remote invocations in
CORBA.
• It consists of 15 primitive types, which include short (16-bit), long (32-bit), unsigned short, unsigned long,
float (32-bit), double (64-bit), char, boolean (TRUE or FALSE), octet (8-bit) and any (which can represent any
basic or constructed type), together with a range of constructed types.
• In Java's object serialization, an object can contain references to other objects.
• When an object is serialized, all the objects that it references are serialized together with it.
• References are serialized as handles. A handle is a reference to an object within the serialized form.
• To serialize an object, its class information is written out, followed by the types and names of its
instance variables.
• If the instance variables belong to new classes, then their class information must also be written out, followed
by the types and names of their instance variables.
• This recursive procedure continues until the class information and the types and names of the instance variables
of all the necessary classes have been written out.
• Each class is given a handle, and no class is written more than once to the stream of bytes; the handles are
written instead where necessary.
• E.g.: Person p = new Person("Smith", "London", 1934);
• The serialized form of this example is shown in Fig. 1.12, where h0 and h1 are handles.

Fig.1.12. Serialized form of Person Object

• Primitive types are written in a portable binary format using methods of the ObjectOutputStream class.
• Strings and characters are written by its writeUTF method, which uses the Universal Transfer Format (UTF-8).

1.7.1.2 To serialize the object (e.g. Person)


• Create an instance of the class ObjectOutputStream and invoke its writeObject method, passing
the Person object as its argument.
• To deserialize an object from a stream of data, open an ObjectInputStream on the stream and use
its readObject method to reconstruct the original object.
• Serialization and deserialization of the arguments and results of remote invocations are generally carried out
automatically by the middleware, without any participation by the application programmer.
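
The two steps can be sketched as a round trip through a byte array. The listing is illustrative: the field names of Person (name, place, year) and the helper class SerializationDemo are assumptions, since the notes give only the constructor call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Assumed shape of the Person class; it must implement Serializable to be serialized.
class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final String place;
    final int year;

    Person(String name, String place, int year) {
        this.name = name;
        this.place = place;
        this.year = year;
    }
}

// Hypothetical helper showing writeObject and readObject.
class SerializationDemo {
    // Serializes an object (and everything it references) into a byte array.
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reconstructs an equivalent object from its serialized form.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The object returned by deserialize is a new object with the same state as the original, not the original object itself.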

1.7.2 Remote Object References


• A remote object reference is an identifier for a remote object that is valid throughout a distributed system.
• A remote object reference is passed in the invocation message to specify which object is to be invoked.
• Even after the remote object associated with a given remote object reference is deleted, it is important that the
remote object reference is not reused.
• A remote object reference can be constructed by concatenating the Internet address of its computer and the port
number of the process that created it with the time of its creation and a local object number. The local object
number is incremented each time an object is created in that process.

Internet address    Port number    Time       Object number    Interface of remote object
32 bits             32 bits        32 bits    32 bits

Fig 1.13 Representation of a Remote Object Reference

1.8 CLIENT-SERVER COMMUNICATION


• This form of communication is designed to support the roles and message exchanges in typical client-server
interactions.
• In general, request-reply communication is synchronous because the client process blocks until the reply
arrives from the server.
• It can also be reliable because the reply from the server is effectively an acknowledgement to the client.
• The client and server exchange messages in terms of the send and receive operations in the Java API.
• A protocol built over datagrams avoids unnecessary overheads associated with the TCP stream protocol.
1.8.1 The Request-Reply Protocol
• This protocol is based on three primitives: doOperation,getRequest and sendReply.
• It can be designed to provide certain delivery guarantees.
• If UDP datagrams are used, the delivery guarantees must be provided by the request-reply protocol, which
may use the server reply message as an acknowledgement of the client request message.
• The doOperation method is used by clients to invoke remote operations. Its arguments specify the remote
object and which method to invoke, together with additional information (arguments) required by the method.
• It is assumed that the client calling doOperation marshals the arguments into an array of bytes and unmarshals
the results from the array of bytes that is returned.

Syntax
public byte[] doOperation(RemoteObjectref o, int methodid, byte[] arguments)

• The doOperation method sends a request message to the server whose Internet address and port are specified in
the remote object reference given as an argument.
• After sending the request message, doOperation invokes receive to get a reply message, from which it extracts
the result and returns it to the caller.
• getRequest is used by a server process to acquire service requests.
• When the server has invoked the method in the specified object, it uses sendReply to send the reply
message to the client.
• When the reply message is received by the client, the original doOperation is unblocked and execution of the
client program continues.

Syntax
public byte[] getRequest();
public void sendReply(byte[] reply, InetAddress clientHost, int clientPort);

Fig1.14 Request-Reply Communication

Message type        int (0 = request, 1 = reply)
Request ID          int
Object reference    RemoteObjectref
Method ID           int or Method
Arguments           array of bytes

Fig.1.15 Request-Reply Structure
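
The message structure can be flattened into an array of bytes along the following lines. This is a sketch under assumptions, not the format used by any particular middleware: the class name RequestReplyMessage is hypothetical, the object reference is simplified to an opaque byte array, the method ID is taken to be an int, and each variable-length field is preceded by its length.

```java
import java.nio.ByteBuffer;

// Hypothetical marshalling of a request-reply message.
class RequestReplyMessage {
    static final int REQUEST = 0;
    static final int REPLY = 1;

    final int messageType;
    final int requestId;
    final byte[] objectReference;   // simplified: an opaque, already-marshalled reference
    final int methodId;
    final byte[] arguments;

    RequestReplyMessage(int messageType, int requestId, byte[] objectReference,
                        int methodId, byte[] arguments) {
        this.messageType = messageType;
        this.requestId = requestId;
        this.objectReference = objectReference;
        this.methodId = methodId;
        this.arguments = arguments;
    }

    // Flattens the fields in a fixed order; variable-length fields carry a length prefix.
    byte[] marshal() {
        return ByteBuffer.allocate(20 + objectReference.length + arguments.length)
                .putInt(messageType)
                .putInt(requestId)
                .putInt(objectReference.length).put(objectReference)
                .putInt(methodId)
                .putInt(arguments.length).put(arguments)
                .array();
    }

    // Rebuilds the message by reading the fields back in the same order.
    static RequestReplyMessage unmarshal(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int type = buf.getInt();
        int id = buf.getInt();
        byte[] ref = new byte[buf.getInt()];
        buf.get(ref);
        int method = buf.getInt();
        byte[] args = new byte[buf.getInt()];
        buf.get(args);
        return new RequestReplyMessage(type, id, ref, method, args);
    }
}
```

The request ID lets doOperation check that a reply it receives is the result of the current request rather than a delayed reply to an earlier one.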

 Failure model of the request-reply protocol


• If the three primitive operations are implemented over UDP datagrams, they suffer from the following
communication failures:
• They suffer from omission failures.
• Messages are not guaranteed to be delivered in sender order.
• The protocol can also suffer from the failure of processes.
• To allow for the occasions when a server has failed or a request or reply message is dropped, doOperation uses
a timeout when it is waiting to get the server's reply message.

The action taken when a timeout occurs depends upon the delivery guarantees to be offered.
• The protocol is designed to recognize successive messages from the same client with the same request identifier
and to filter out duplicates. If the server has already sent the reply when it receives a duplicate request, it will
need to execute the operation again to obtain the result, unless it has stored the result of the original execution.
• Some servers can execute their operations more than once and obtain the same results each time.
• An idempotent operation is one that can be performed repeatedly with the same effect as if it had been performed
exactly once.
• For servers that require retransmission of replies without re-execution of operations, a history may be used.
• The term 'history' refers to a structure that contains a record of reply messages that have been transmitted.
• An entry in a history contains a request identifier, a message and an identifier of the client to which it was sent.
• Its purpose is to allow the server to retransmit reply messages when client processes request them.
• A problem associated with the use of a history is its memory cost.
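
A history of this kind can be sketched as a map from (client, request identifier) to the stored reply. The class ReplyHistory and its method names are hypothetical, and the operation is passed in as a function so that the sketch stays self-contained; a real server would also need a policy for discarding old entries to bound the memory cost.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical server-side history: duplicate requests get the stored reply, not a re-execution.
class ReplyHistory {
    private final Map<String, byte[]> history = new HashMap<>();
    private int executions = 0;

    // An entry is identified by the client identifier together with the request identifier.
    private static String key(String clientId, int requestId) {
        return clientId + "#" + requestId;
    }

    // Executes the operation at most once per (client, request ID) and records the reply.
    byte[] handle(String clientId, int requestId, byte[] request,
                  Function<byte[], byte[]> operation) {
        String k = key(clientId, requestId);
        byte[] stored = history.get(k);
        if (stored != null) {
            return stored;                   // duplicate request: retransmit the recorded reply
        }
        executions++;
        byte[] reply = operation.apply(request);
        history.put(k, reply);
        return reply;
    }

    // Discards an entry once the server knows the client has received the reply.
    void discard(String clientId, int requestId) {
        history.remove(key(clientId, requestId));
    }

    int executions() {
        return executions;
    }
}
```

This matters for operations that are not idempotent, such as incrementing a counter or appending to a file, where executing a retransmitted request a second time would change the result.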
 RPC Exchange Protocols
The following three protocols are used for implementing various types of RPC. They produce differing
behaviors in the presence of communication failures.
 The request (R) protocol
It may be used when there is no value to be returned from the procedure and the client requires no confirmation that
the procedure has been executed.
 The request-reply (RR) protocol
It is useful for most client-server exchanges. Special acknowledgement messages are not required, because a server's
reply message is regarded as an acknowledgement of the client's request message.
 The request-reply-acknowledge reply (RRA) protocol
It is based on the exchange of three messages: request, reply and acknowledge reply. The
acknowledgement contains the request ID from the reply message being acknowledged, which enables the server to
discard entries from its history.

 Use of TCP streams to implement the request-reply protocol


• The desire to avoid implementing multi-packet protocols is one of the reasons for using TCP streams, which allow
arguments and results of any size to be transmitted.
• If TCP is used, it ensures that the messages are delivered reliably, so there is no need for the request-reply
protocol to deal with retransmission of messages and filtering of duplicates, or with histories.
• The overhead due to acknowledgement messages is reduced when a reply message follows soon after a request
message.
 HTTP:an example of a request-reply protocol
• HTTP (HyperText Transfer Protocol) is a protocol that specifies the messages involved in a request-reply exchange,
the methods, arguments and results, and the rules for representing them in the messages.
• It supports a fixed set of methods (GET, PUT, POST, etc.) that are applicable to all of its resources.
• In addition to invoking methods on web resources, the protocol allows for content negotiation and password-style
authentication.
 Content negotiation
• Clients' requests can include information as to what data representations they can accept, enabling the server
to choose the representation that is most appropriate for the user.
 Authentication
• Password-style authentication is provided to prove the identity of the source.
• HTTP is implemented over TCP.
• In the original version of the protocol, each client-server interaction consists of the following steps:
 The client requests and the server accepts a connection at the default server port or at a port specified in the
URL.

 The client sends a request message to the server.


 The server sends a reply message to the client.
 The connection is closed.
• However, the need to establish and close a connection for every request-reply exchange is expensive, both in
overloading the server and in sending too many messages over the network.
• To overcome this, a later version of the protocol (HTTP 1.1) uses persistent connections: connections that remain
open over a series of request-reply exchanges between client and server.
• Requests and replies are marshalled into messages as ASCII text strings, but resources can be represented as byte
sequences and may be compressed.
• Resources implemented as data are supplied as Multipurpose Internet Mail Extensions (MIME)-like structures in
arguments and results.
• MIME is a standard for sending multipart data containing text, images and sound in e-mail messages.
• Data is prefixed with its MIME type so that the recipient will know how to handle it.
 HTTP Methods
 GET
• Requests the resource whose URL is given as its argument.
 HEAD
• Identical to GET, but it does not return any data.
• However, it does return all the information about the data, such as the time of last modification, its type and
its size.
 POST
• Specifies the URL of a resource that can deal with the data supplied with the request. The processing carried out
on the data depends on the function of the program specified in the URL.
 PUT
• Requests that the data supplied in the request is stored with the given URL as its identifier.
 DELETE
• The server deletes the resource identified by the given URL.
• Servers may not always allow this operation, in which case the reply indicates failure.
 OPTIONS
• The server supplies the client with a list of methods it allows to be applied to the given URL.
 TRACE
• Used for diagnostic purposes.

 Message Contents
• A request message specifies the name of a method, the URL of a resource, the protocol version, some headers
and an optional message body.
• A reply message specifies the protocol version, a status code and 'reason', some headers and an optional message
body.
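
The layout of the two kinds of message can be illustrated by building a request and picking apart a reply as plain text. The sketch below is illustrative and does not contact any server: the class name HttpMessages, the chosen headers and the example host used with it are assumptions, and only the first line of the reply is parsed.

```java
// Hypothetical helper: composes an HTTP request message and reads a reply's status line.
class HttpMessages {
    // Request message: method, URL path and protocol version, then headers, then a blank line.
    static String buildGet(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Accept: text/html\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Splits the status line of a reply into protocol version, status code and reason.
    private static String[] statusParts(String reply) {
        String statusLine = reply.split("\r\n", 2)[0];
        return statusLine.split(" ", 3);
    }

    // Numeric status code of a reply, such as 200 or 404.
    static int statusCode(String reply) {
        return Integer.parseInt(statusParts(reply)[1]);
    }

    // Textual reason that accompanies the status code.
    static String reason(String reply) {
        return statusParts(reply)[2];
    }
}
```

The Accept header in the request is an example of content negotiation: it tells the server which data representations the client can handle.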

1.9 GROUP COMMUNICATION


• The pairwise exchange of messages is not the best model for communication from one process to a group of
other processes.
• A multicast operation is more appropriate: an operation that sends a single message from one process to
each of the members of a group of processes, usually in such a way that the membership of the group
is transparent to the sender.
• Multicast messages provide a useful infrastructure for constructing distributed systems with the following
characteristics:
• Fault tolerance based on replicated services.
• Finding the discovery servers in spontaneous networking.
• Better performance through replicated data.
• Propagation of event notifications.
• IP multicast is built on top of the Internet Protocol (IP).
• A multicast group is specified by a class D Internet address.
• The membership is dynamic, allowing computers to join or leave at any time.
• The Java API provides a datagram interface to IP multicast through the class MulticastSocket, which is
a subclass of DatagramSocket with the additional capability of being able to join multicast groups.
• A process can join a multicast group with a given multicast address by invoking the joinGroup method of its
MulticastSocket.
• A process can leave a specified group by invoking the leaveGroup method of its MulticastSocket.
• In the program below, the arguments to the main method specify a message to be multicast and the multicast address of a group.
• After joining that multicast group, the process makes an instance of DatagramPacket containing the message and sends it through its multicast socket to the multicast group.
• After that, it attempts to receive n multicast messages from its peers via its socket, which also belongs to the group on the same port.

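import java.net.*;
import java.io.*;

public class MulticastPeer {
    public static void main(String args[]) {
        // args[0] is the message to multicast, args[1] is the multicast group address
        MulticastSocket s = null;
        // n: number of messages to receive. The notes relate n to the number of
        // members in the group; the value 3 is an assumption.
        int n = 3;
        try {
            InetAddress group = InetAddress.getByName(args[1]);
            s = new MulticastSocket(6789);
            s.joinGroup(group);
            // send the message to every member of the group
            byte[] m = args[0].getBytes();
            DatagramPacket message = new DatagramPacket(m, m.length, group, 6789);
            s.send(message);
            // receive n messages from the peers in the group
            byte[] buffer = new byte[1000];
            for (int i = 0; i < n; i++) {
                DatagramPacket messageIn = new DatagramPacket(buffer, buffer.length);
                s.receive(messageIn);
                System.out.println("Received:" + new String(messageIn.getData(), 0, messageIn.getLength()));
            }
            s.leaveGroup(group);
        } catch (SocketException e) {
            System.out.println("Socket:" + e.getMessage());
        } catch (IOException e) {
            System.out.println("IO:" + e.getMessage());
        } finally {
            if (s != null) s.close();
        }
    }
}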
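
The group address passed to a program such as MulticastPeer must be a class D address, that is, an IPv4 address in the range 224.0.0.0 to 239.255.255.255. The following sketch checks this in two ways. The class and method names are hypothetical, and the first method assumes its argument is an IPv4 address literal (for a host name, InetAddress.getByName would perform a lookup instead).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class ClassDCheck {
    // Uses the library test for multicast addresses.
    // Assumes the argument is an IPv4 literal such as "228.5.6.7".
    static boolean isClassD(String ipv4Literal) {
        try {
            return InetAddress.getByName(ipv4Literal).isMulticastAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an address literal: " + ipv4Literal, e);
        }
    }

    // The same test by hand: the top four bits of the first octet are 1110.
    static boolean firstOctetIsClassD(int firstOctet) {
        return (firstOctet & 0xF0) == 0xE0;
    }

    public static void main(String[] args) {
        System.out.println(isClassD("228.5.6.7"));      // true
        System.out.println(isClassD("192.168.1.1"));    // false
        System.out.println(firstOctetIsClassD(239));    // true
    }
}
```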

1.10 CASE STUDY: INTERPROCESS COMMUNICATION IN UNIX


• The IPC primitives in BSD 4.x versions of UNIX are provided as system calls that are implemented as a layer over the Internet TCP and UDP protocols.
• Message destinations are specified as socket addresses. A socket address consists of an Internet address and a local port number.
• The IPC operations are based on the socket abstraction. Messages are queued at the sending socket until the networking protocol has transmitted them, and until an acknowledgement arrives, if the protocol requires one.
• When messages arrive, they are queued at the receiving socket until the receiving process makes an appropriate system call to receive them.
• Any process can create a socket to communicate with another process.
• This is done by invoking the socket system call, whose arguments specify the communication domain, the type and sometimes a particular protocol. The protocol is usually selected by the system according to whether the communication is datagram or stream.
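
As a small illustration of a socket address as an (Internet address, port number) pair, here is how the same idea looks in Java. This is a sketch using the Java class InetSocketAddress rather than the UNIX structures the section describes; the class name SocketAddressSketch, the method describe and the port 6789 are chosen only for the example.

```java
import java.net.InetSocketAddress;

final class SocketAddressSketch {
    // Render a socket address as "internet-address:port".
    static String describe(InetSocketAddress a) {
        return a.getAddress().getHostAddress() + ":" + a.getPort();
    }

    public static void main(String[] args) {
        // An address literal is used, so no name lookup is involved.
        InetSocketAddress destination = new InetSocketAddress("127.0.0.1", 6789);
        System.out.println(describe(destination));   // 127.0.0.1:6789
    }
}
```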
 Datagram Communication
• In order to send datagrams, a socket pair is identified each time a communication is made.
• This is achieved by the sending process using its local socket descriptor and the socket address of the receiving socket each time it sends a message.

Fig. 1.16 illustrates the sockets used for datagrams. ClientAddress and ServerAddress are socket addresses.

Sending process                          Receiving process
s = socket(AF_INET, SOCK_DGRAM, 0)       s = socket(AF_INET, SOCK_DGRAM, 0)
bind(s, ClientAddress)                   bind(s, ServerAddress)
sendto(s, "message", ServerAddress)      amount = recvfrom(s, buffer, from)

Fig. 1.16 Sockets used for Datagrams


• Both processes use the socket call to create a socket and get a descriptor for it.
• The first argument of socket specifies the communication domain as the Internet domain, and the second argument indicates that datagram communication is required.
• The last argument to the socket call may be used to specify a particular protocol, but setting it to zero causes the system to select a suitable protocol, UDP in this case.
• Both processes use the bind call to bind their sockets to socket addresses.
• The sending process binds its socket to a socket address referring to any available local port number.
• The receiving process binds its socket to a socket address that contains its server port and must be made known to the sender.
• The sending process uses the sendto call with arguments specifying the socket through which the message is to be sent, the message itself and the socket address of the destination.
• The sendto call hands the message to the underlying UDP and IP protocols and returns the actual number of bytes sent.
• As a datagram service is requested, the message is transmitted to its destination without an acknowledgement.
• If the message is too long to be sent, there is an error return.
• The receiving process uses the recvfrom call with arguments specifying the local socket on which to receive a message, the memory locations in which to store the message and the socket address of the sending socket.
• The recvfrom call collects the first message in the queue at the socket, or if the queue is empty it waits until a message arrives.
• Communication occurs only when a sendto in one process addresses its message to the socket used by a recvfrom in another process.
• In client-server communication there is no need for servers to have prior knowledge of clients' socket addresses, because the recvfrom operation supplies the sender's address with each message it delivers.
• The properties of datagram communication in UNIX are the same as those described in Section 1.6.
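
The datagram exchange above can be tried out in Java, where creating a DatagramSocket corresponds roughly to socket plus bind, send to sendto, and receive to recvfrom. The sketch below is an illustration under several assumptions: both sockets live in one process on the loopback interface, the system chooses the port numbers, receives time out after two seconds, and the names DatagramEcho and exchange are hypothetical. It shows the point made in the text that the server needs no prior knowledge of the client's address, because the address arrives with the datagram.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

final class DatagramEcho {
    // Sends one datagram to a server socket and returns the server's reply.
    static String exchange(String text) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the system for any available local port.
        try (DatagramSocket server = new DatagramSocket(0, loopback);
             DatagramSocket client = new DatagramSocket(0, loopback)) {
            server.setSoTimeout(2000);
            client.setSoTimeout(2000);

            // Like sendto: the packet carries the message and the destination address.
            byte[] out = text.getBytes(StandardCharsets.UTF_8);
            client.send(new DatagramPacket(out, out.length, loopback, server.getLocalPort()));

            // Like recvfrom: waits for a queued datagram and records the sender's address.
            byte[] buffer = new byte[1000];
            DatagramPacket request = new DatagramPacket(buffer, buffer.length);
            server.receive(request);
            String received = new String(request.getData(), 0, request.getLength(),
                                         StandardCharsets.UTF_8);

            // The reply is addressed using only the address delivered with the request.
            byte[] reply = ("echo:" + received).getBytes(StandardCharsets.UTF_8);
            server.send(new DatagramPacket(reply, reply.length, request.getSocketAddress()));

            DatagramPacket response = new DatagramPacket(new byte[1000], 1000);
            client.receive(response);
            return new String(response.getData(), 0, response.getLength(),
                              StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(exchange("hello"));
    }
}
```

Because no acknowledgement is involved, a lost datagram would show up here only as a receive timeout. On the loopback interface that is unlikely, but it is the reason for the timeouts.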

 Stream Communication
• In order to use the stream protocol, two processes must first establish a connection between their pair of sockets.
• The arrangement is asymmetric because one of the sockets will be listening for a request for connection and the other will be asking for a connection.
• Once a pair of sockets has been connected, they may be used for transmitting data in both or either direction. That is, they behave like streams in that any available data is read immediately in the same order as it was written, and there is no indication of the boundaries of messages. There is a bounded queue at the receiving socket: the receiver blocks if the queue is empty, and the sender blocks if it is full.
• Fig. 1.17 illustrates stream communication, in which the details of the arguments are simplified. It does not show the server closing the socket on which it listens.
• Normally a server would first listen and accept a connection and then fork a new process to communicate with the client.
• Meanwhile, it will continue to listen in the original process.
• The properties of stream communication in UNIX are the same as those described in Section 1.6.
• The server or listening process first uses the socket operation to create a stream socket and the bind operation to bind its socket to the server's socket address.
• The second argument to the socket system call is given as SOCK_STREAM to indicate that stream communication is required.
• If the third argument is left as zero, the TCP/IP protocol will be selected automatically.
• It uses the listen operation to listen on its socket for client requests for connections.
• The second argument to the listen system call specifies the maximum number of requests for connections that can be queued at this socket.
• The server uses the accept system call to accept a connection requested by a client and obtain a new socket for communication with that client.
• The original socket may still be used to accept further connections from other clients.
• The client process uses the socket operation to create a stream socket and then uses the connect system call to request a connection via the socket address of the listening process.
• As the connect call automatically binds a socket name to the caller's socket, prior binding is unnecessary.
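
The stream steps above map roughly onto Java as follows: constructing a ServerSocket covers socket, bind and listen; accept returns a new socket for the client; constructing a Socket covers the client's socket and connect. The sketch below is an illustration under assumptions, not the UNIX code itself: both ends run in one process on the loopback interface, the backlog of 5 and the names StreamSketch and exchange are chosen for the example, and there is no forking of a process per client. It also shows that a stream carries no message boundaries, since two separate writes are read back as one sequence of bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class StreamSketch {
    // Writes two strings on the client side and returns everything the server side reads.
    static String exchange(String first, String second) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 picks a free port; 5 is the maximum queue of pending connection requests.
        try (ServerSocket listener = new ServerSocket(0, 5, loopback);
             // The client's local address is bound automatically on connect.
             Socket client = new Socket(loopback, listener.getLocalPort());
             // accept returns a new socket; the listener stays usable for other clients.
             Socket connection = listener.accept()) {

            OutputStream out = client.getOutputStream();
            out.write(first.getBytes(StandardCharsets.UTF_8));
            out.write(second.getBytes(StandardCharsets.UTF_8));
            client.shutdownOutput();   // lets the reader see end of stream

            // Read until end of stream; the two writes are not separated in any way.
            InputStream in = connection.getInputStream();
            ByteArrayOutputStream all = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            int count;
            while ((count = in.read(buffer)) != -1) {
                all.write(buffer, 0, count);
            }
            return new String(all.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(exchange("first;", "second"));
    }
}
```

The client can connect before accept is called because the pending request waits in the listen queue. A real server would normally hand each accepted socket to a separate process or thread.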
