CS3551 Unit 1 Notes
UNIT I INTRODUCTION
1. INTRODUCTION
The process of computation started with work on a single processor. This uniprocessor
computing can be termed centralized computing. As the demand for increased processing
capability grew, multiprocessor systems came into existence. The advent of multiprocessor
systems led to the development of distributed systems with a high degree of scalability and
resource sharing. Modern-day parallel computing is a subset of distributed computing.
QoS parameters
A distributed system must offer the following QoS parameters:
Performance
Reliability
Availability
Security
The interaction of the layers of the network with the operating system and
middleware is shown in Fig 1.2. The middleware contains important library functions for
facilitating the operations of DS.
The distributed system uses a layered architecture to break down the complexity of system
design. The middleware is the distributed software that drives the distributed system while
providing transparency of heterogeneity at the platform level.
3. Motivation
The following are the key points that act as a driving force behind DS:
Communication among processors takes place via shared data variables and control
variables used for synchronization among the processors. Communication
between the tasks in multiprocessor systems takes place through two main modes:
Buffered: The standard option copies the data from the user buffer to the kernel
buffer. The data later gets copied from the kernel buffer onto the network. For the
Receive primitive, the buffered option is usually required because the data may
already have arrived when the primitive is invoked, and needs a storage place in
the kernel.
Unbuffered: The data gets copied directly from the user buffer onto the network.
Synchronous
A Send or a Receive primitive is synchronous if both the Send() and Receive()
handshake with each other.
The processing for the Send primitive completes only after the invoking
processor learns that the other corresponding Receive primitive has also been
invoked and that the receive operation has been completed.
The processing for the Receive primitive completes when the data to be
received is copied into the receiver’s user buffer.
Asynchronous
A Send primitive is said to be asynchronous if control returns to the
invoking process after the data item to be sent has been copied out of the user-
specified buffer.
It does not make sense to define asynchronous Receive primitives.
Blocking primitives
The primitive commands wait for the message to be delivered. The execution of
the processes is blocked.
The sending process must wait after a send until an acknowledgement is made
by the receiver.
The receiving process must wait for the expected message from the sending
process.
The receipt is determined by polling a common buffer or via an interrupt.
This is a form of synchronization or synchronous communication.
A primitive is blocking if control returns to the invoking process after the
processing for the primitive completes.
Figure 1.7: A non-blocking send primitive. When the Wait call returns, at least one of
its parameters is posted.

Fig a) Blocking synchronous send and blocking receive. Fig b) Non-blocking synchronous
send and blocking receive.

The asynchronous Send completes when the data has been copied out of the user's buffer.
Checking for the completion may be necessary if the user wants to reuse the buffer from
which the data was sent.
Non-blocking Receive:
The Receive call will cause the kernel to register the call and return the handle
of a location that the user process can later check for the completion of the
non-blocking Receive operation.
This location gets posted by the kernel after the expected data arrives and is
copied to the user-specified buffer. The user process can check for the
completion of the non-blocking Receive by invoking the Wait operation on the
returned handle.
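The following is a minimal, single-machine sketch of these semantics in Python (all names are illustrative; a queue and a worker thread stand in for the kernel buffer and the network). The blocking Receive returns only after the data has arrived, while the non-blocking Receive immediately returns a handle (here a Future), and the Wait operation corresponds to calling result() on that handle.

```python
# Illustrative sketch only: a queue plays the role of the kernel buffer,
# and a Future plays the role of the handle returned by a non-blocking call.
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

channel = queue.Queue()               # stand-in for kernel buffer / network
executor = ThreadPoolExecutor(max_workers=1)

def send(msg):
    channel.put(msg)                  # buffered send: copy into the "kernel"

def receive_blocking():
    return channel.get()              # control returns only after data arrives

def receive_nonblocking():
    # Returns a handle immediately; the waiting happens in a helper thread.
    return executor.submit(channel.get)

threading.Thread(target=send, args=("hello",)).start()  # a "remote" sender

handle = receive_nonblocking()        # control returns at once
# ... the process can do other useful work here ...
print(handle.result())                # the Wait operation: blocks until posted
```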
Processor Synchrony
Processor synchrony indicates that all the processors execute in lock-step with their clocks
synchronized.
Since distributed systems do not follow a common clock, this abstraction is implemented
using some form of barrier synchronization to ensure that no processor begins executing the
next step of code until all the processors have completed executing the previous steps of
code assigned to each of the processors.
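As an illustration, the sketch below uses Python threads as stand-ins for processors; threading.Barrier ensures that no thread begins its next step until all threads have completed the previous one.

```python
# Illustrative sketch: threads stand in for processors executing in lock-step.
import threading

N = 3
barrier = threading.Barrier(N)        # all N must arrive before any proceeds

def processor(pid):
    print(f"processor {pid}: finished step 1")
    barrier.wait()                    # barrier synchronization point
    print(f"processor {pid}: starting step 2")

threads = [threading.Thread(target=processor, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```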
RMI vs RPC:
RMI uses an object-oriented paradigm, where the user needs to know the object and the
method of the object to be invoked. RPC is not object oriented and does not deal with
objects; rather, it calls specific subroutines that are already established.
RMI handles the complexities of passing along the invocation from the local to the remote
computer, but instead of passing a procedural call, it passes a reference to the object and
the method being called. With RPC, the call looks like a local call, and RPC handles the
complexities involved with passing the call from the local to the remote computer.
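To make the RPC half of this comparison concrete, here is a minimal sketch using Python's standard xmlrpc library (the port number and the add() subroutine are illustrative assumptions, not from the source text): the client invokes an established subroutine on the server as if it were a local call.

```python
# Illustrative RPC sketch using Python's standard library xmlrpc module.
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def add(x, y):                        # the pre-established remote subroutine
    return x + y

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add)
threading.Thread(target=server.serve_forever, daemon=True).start()

proxy = xmlrpc.client.ServerProxy("http://localhost:8000")
print(proxy.add(2, 3))                # looks like a local call; runs remotely
```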
Asynchronous Execution:
Communication among processes is considered asynchronous when every
communicating process can have a different observation of the order of the messages being
exchanged. In an asynchronous execution:
there is no processor synchrony and there is no bound on the drift rate of
processor clocks
message delays are finite but unbounded
there is no upper bound on the time taken by a process to execute a step
If system A can be emulated by system B, denoted A/B, and if a problem is not solvable in
B, then it is also not solvable in A. If a problem is solvable in A, it is also solvable in B.
Hence, in a sense, all four classes are equivalent in terms of computability in failure-free
systems.
The design of distributed systems has numerous challenges. They can be categorized
into:
Issues related to system and operating systems design
Issues related to algorithm design
Issues arising due to emerging technologies
The above three classes are not mutually exclusive.
Issues related to system and operating systems design
The following are some of the common challenges to be addressed in designing a
distributed system from the system perspective:
Communication: This task involves designing suitable communication mechanisms
among the various processes in the networks.
Examples: RPC, RMI
Processes: The main challenges involved are: process and thread management at both
client and server environments, migration of code between systems, design of software and
mobile agents.
Naming: Devising easy to use and robust schemes for names, identifiers, and
addresses is essential for locating resources and processes in a transparent and scalable
manner. The remote and highly varied geographical locations make this task difficult.
Synchronization: Mutual exclusion, leader election, deploying physical clocks,
global state recording are some synchronization mechanisms.
Data storage and access schemes: Designing file systems for easy and efficient data
storage with implicit accessing mechanisms is essential for distributed operation.
Consistency and replication: The notion of distributed systems goes hand in hand
with replication of data to provide a high degree of scalability. Replicas should be handled
with care, since data consistency is a prime issue.
Fault tolerance: This requires maintaining correct operation despite failures of links, nodes, and processes.
Some of the common fault tolerant techniques are resilience, reliable communication,
distributed commit, check pointing and recovery, agreement and consensus, failure
detection, and self-stabilization.
Security: Cryptography, secure channels, access control, key management
(generation and distribution), authorization, and secure group management are some of the
security measures imposed on distributed systems.
Applications Programming Interface (API) and transparency: User-friendliness
and ease of use are very important for making distributed services usable by a
wide community. Transparency, which is hiding the inner implementation policy from users, is
of the following types:
Access transparency: hides differences in data representation
Location transparency: hides differences in locations by providing uniform access to
data located at remote locations.
Migration transparency: allows relocating resources without changing names.
Replication transparency: Makes the user unaware whether he is working on
original or replicated data.
Concurrency transparency: Masks the concurrent use of shared resources for the
user.
Failure transparency: masks failures from users, so that the system appears reliable and fault-tolerant.
Scalability and modularity: The algorithms, data and services must be as distributed
as possible. Various techniques such as replication, caching and cache management, and
asynchronous processing help to achieve scalability.
Real-time scheduling becomes more challenging when a global view of the system state is
absent and on-line or dynamic changes are more frequent. Message propagation delays,
which are network-dependent, are hard to control or predict. This is a hindrance to meeting
the QoS requirements of the network.
Performance
User-perceived latency in distributed systems must be reduced. The common issues in
performance are:
Metrics: Appropriate metrics must be defined for measuring the performance of
theoretical distributed algorithms and their implementations.
Measurement methods/tools: Since a distributed system is a complex entity,
appropriate methodologies and tools must be developed for measuring the performance
metrics.
Sensor networks
o A sensor is a processor with an electro-mechanical interface that is capable
of sensing physical parameters.
o They are low-cost equipment with limited computational power and battery
life. They are designed to handle streaming data and route it to external
computer networks and processes.
o They are susceptible to faults and have to reconfigure themselves.
o These features introduce a whole new set of challenges, such as position
estimation and time estimation, when designing a distributed system.
Ubiquitous or pervasive computing
o In ubiquitous systems the processors are embedded in the environment to
perform application functions in the background.
o Examples: intelligent devices, smart homes, etc.
o They are distributed systems with recent advancements, operating in wireless
environments through actuator mechanisms.
o They can be self-organizing and network-centric, with limited resources.
Peer-to-peer computing
o Peer-to-peer (P2P) computing is computing over an application layer network
where all interactions among the processors are at the same level.
o This is a form of symmetric computation, in contrast to the client-server paradigm.
o They are self-organizing, with or without a regular structure to the network.
o Some of the key challenges include: object storage mechanisms, efficient
object lookup, and retrieval in a scalable manner; dynamic reconfiguration
with nodes as well as objects joining and leaving the network randomly;
replication strategies to expedite object search; tradeoffs between object size,
latency, and table sizes; and anonymity, privacy, and security.
Publish-subscribe, content distribution, and multimedia
o Present-day users require only the information of interest.
o In a dynamic environment where the information constantly fluctuates, there is
great demand for:
o Publish: an efficient mechanism for distributing this information.
o Subscribe: an efficient mechanism to allow end users to indicate interest in
receiving specific kinds of information.
o An efficient mechanism for aggregating large volumes of published
information and filtering it as per the user's subscription filter.
o Content distribution refers to a mechanism that categorizes the information
based on parameters.
o Publish-subscribe and content distribution overlap each other.
o Multimedia data introduces special issues because of its large size.
Distributed agents
o Agents are software processes or sometimes robots that move around the
system to do specific tasks for which they are programmed.
o Agents collect and process information and can exchange such information
with other agents.
o Challenges in distributed agent systems include coordination mechanisms
among the agents, controlling the mobility of the agents, and their software
design and interfaces.
Distributed data mining
o Data mining algorithms process large amounts of data to detect patterns and
trends in the data, to mine or extract useful information.
o The mining can be done by applying database and artificial intelligence
techniques to a data repository.
Grid computing
Grid computing is deployed to manage resources. For instance, idle CPU
cycles of machines connected to the network are made available to others.
The challenges include: scheduling jobs, frameworks for implementing quality
of service, real-time guarantees, and security.
Security in distributed systems
The challenges of security in a distributed setting include confidentiality,
authentication, and availability. These can be addressed using efficient and scalable solutions.
DISTRIBUTED PROGRAM
Let e_i^x denote the x-th event at process p_i. For a message m, let send(m) and rec(m) denote
its send and receive events, respectively.
The occurrence of events changes the states of respective processes and channels, thus
causing transitions in the global system state.
An internal event changes the state of the process at which it occurs.
A send event changes the state of the process that sends the message and the state of
the channel on which the message is sent.
The execution of process p_i produces a sequence of events e_i^1, e_i^2, e_i^3, ..., and it is
denoted by H_i: H_i = (h_i, →_i). Here h_i is the set of events produced by p_i, and →_i is a
binary relation that defines the causal dependencies among the events of p_i.
The relation →_msg indicates the causal dependency that exists due to message passing between two events.
When all the above conditions are satisfied, it can be concluded that a → b, i.e., a and b are
causally related. Consider two events c and d: if both c → d and d → c are false, they are not
causally related, and c and d are said to be concurrent events, denoted as c ∥ d.
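A small sketch of how this can be checked in code, assuming the causal precedence relation is given as a set of directed edges (process order plus message edges, all made up for illustration): a → b holds iff b is reachable from a, and two events are concurrent iff neither precedes the other.

```python
# Illustrative sketch: causality as reachability in the event graph.
def precedes(edges, a, b):
    """True iff event a causally precedes event b (a path from a to b)."""
    stack, seen = [a], set()
    while stack:
        e = stack.pop()
        if e == b:
            return True
        if e not in seen:
            seen.add(e)
            stack.extend(t for (s, t) in edges if s == e)
    return False

# p1: e11 -> e12; p2: e21 -> e22; message sent at e11, received at e22.
edges = [("e11", "e12"), ("e21", "e22"), ("e11", "e22")]

print(precedes(edges, "e11", "e22"))      # True: causally related
print(not precedes(edges, "e12", "e21") and
      not precedes(edges, "e21", "e12"))  # True: e12 || e21 (concurrent)
```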
This property ensures that causally related messages destined to the same destination are delivered in an
order that is consistent with their causality relation. Causally ordered delivery of messages implies FIFO
message delivery. Causal ordering model considerably simplifies the design of distributed algorithms
because it provides a built-in synchronization.
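One standard way to realize causally ordered delivery (not prescribed by the text above, but a classic technique from the Birman-Schiper-Stephenson protocol) is a vector-clock delivery condition: a message is delivered only when it is the next expected message from its sender and all causally earlier messages have already been delivered. A hedged sketch:

```python
# Illustrative sketch of the vector-clock delivery test for causal ordering.
def deliverable(msg_vt, sender, local_vt):
    """True iff a message with vector timestamp msg_vt from `sender` may be
    delivered at a process whose current vector clock is local_vt."""
    return (msg_vt[sender] == local_vt[sender] + 1 and   # FIFO from sender
            all(msg_vt[k] <= local_vt[k]                 # no causal gaps
                for k in range(len(msg_vt)) if k != sender))

local = [1, 0, 0]   # this process has delivered 1 message from p0 so far
print(deliverable([2, 0, 0], 0, local))  # True: next message from p0
print(deliverable([2, 1, 0], 1, local))  # False: depends on an undelivered
                                         # message from p0; buffer it for now
```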
The relation between the three models is given by CO ⊂ FIFO ⊂ Non-FIFO.
GLOBAL STATE
The global state of a distributed system is a collection of the local states of its
components, namely, the processes and the communication channels.
The state of a process at any time is defined by the contents of processor registers,
stacks, local memory, etc. and depends on the local context of the distributed
application.
The state of a channel is given by the set of messages in transit in the channel.
The occurrence of events changes the states of respective processes and channels.
Example: An internal event changes the state of the process at which it occurs. A send event
changes the state of the process that sends the message and the state of the channel on which the
message is sent. A receive event changes the state of the process that receives the message and
the state of the channel on which the message is received.
Notations:
1. LS_i^x denotes the state of process p_i after the occurrence of event e_i^x and before the event e_i^{x+1}.
2. LS_i^0 denotes the initial state of process p_i.
3. LS_i^x is a result of the execution of all the events executed by process p_i till e_i^x.
4. Let send(m) ≤ LS_i^x denote the fact that ∃y: 1 ≤ y ≤ x :: e_i^y = send(m).
5. Let rec(m) ≰ LS_i^x denote the fact that ∀y: 1 ≤ y ≤ x :: e_i^y ≠ rec(m).
Channel State:
The state of a channel depends upon the states of the processes it connects.
Let SC_ij^{x,y} denote the state of a channel C_ij. The state of a channel is defined as follows:

SC_ij^{x,y} = { m_ij | send(m_ij) ≤ LS_i^x ∧ rec(m_ij) ≰ LS_j^y }

Thus, channel state SC_ij^{x,y} denotes all messages that p_i sent up to event e_i^x and which
process p_j had not received until event e_j^y.
The global state of a distributed system is a collection of the local states of the processes and the channels.
Notationally, the global state GS is defined as

GS = { ∪_i LS_i^{x_i}, ∪_{j,k} SC_jk^{y_j,z_k} }

For a global state to be meaningful, the states of all the components of the distributed system must be recorded
at the same instant.
In the figure, a global state GS1 = { LS_1^1, LS_2^3, LS_3^3, LS_4^2 } is inconsistent because the state of p2 has
recorded the receipt of message m12, while the state of p1 has not recorded its send. A global state GS2 consisting
of local states { LS_1^2, LS_2^4, LS_3^4, LS_4^2 } is consistent; all the channels are empty except C21, which
contains message m21.
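A minimal sketch of testing a cut for consistency, under the simplifying assumption that each local state records just the sets of message identifiers sent and received up to the cut (the m12/m21 values mirror the example above): the cut is inconsistent if some recorded receive has no matching recorded send, and each channel's state is the sent-but-not-yet-received difference.

```python
# Illustrative sketch: consistency check for a recorded global state (cut).
def channel_state(sent, received):
    """Messages in transit on a channel: sent up to the cut, not yet received."""
    return sent - received

def consistent(sent_by, received_by):
    """A cut is consistent iff every recorded receive has a recorded send."""
    all_sent = set().union(*sent_by.values())
    all_received = set().union(*received_by.values())
    return all_received <= all_sent       # no "orphan" messages

# Mirrors GS1 above: p2 records receiving m12, but p1 has not recorded its send.
sent_by     = {"p1": set(),   "p2": {"m21"}}
received_by = {"p1": set(),   "p2": {"m12"}}
print(consistent(sent_by, received_by))     # False: GS1-style inconsistency

# Once p1's send of m12 is included (as in GS2), the cut becomes consistent.
sent_by["p1"] = {"m12"}
print(consistent(sent_by, received_by))     # True
print(channel_state(sent_by["p2"], set()))  # {'m21'}: still in transit on C21
```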