DC Unit 1
UNIT I INTRODUCTION
1. INTRODUCTION
Computation started out on single processors; this uniprocessor model can be termed centralized computing. As the demand for increased processing capability grew, multiprocessor systems came into existence. The advent of multiprocessor systems led to the development of distributed systems with a high degree of scalability and resource sharing. Modern-day parallel computing is a subset of distributed computing.
Autonomy and heterogeneity – Here the processors are “loosely coupled” in that they have
different speeds and each can be running a different operating system.
QoS parameters
Distributed systems must offer the following QoS:
Performance
Reliability
Availability
Security
As shown in Fig 1.1, each computer has a memory-processing unit, and the computers are connected by a communication network. Each system connected to the distributed network hosts distributed software, which is a middleware technology. This drives the Distributed System (DS) while preserving its heterogeneity. The term computation or run in a distributed system refers to the execution of processes to achieve a common goal.
The interaction of the layers of the network with the operating system and
middleware is shown in Fig 1.2. The middleware contains important library functions for
facilitating the operations of DS.
The distributed system uses a layered architecture to break down the complexity of system design. The middleware is the distributed software that drives the distributed system while providing transparency of heterogeneity at the platform level.
3. Motivation
The following are the key points that act as driving forces behind DS:
Communication among processors takes place via shared data variables and control variables for synchronization among the processors. Communication between the tasks in multiprocessor systems takes place through two main modes:
Buffered: The standard option copies the data from the user buffer to the kernel
buffer. The data later gets copied from the kernel buffer onto the network. For the
Receive primitive, the buffered option is usually required because the data may
already have arrived when the primitive is invoked, and needs a storage place in
the kernel.
Unbuffered: The data gets copied directly from the user buffer onto the network.
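The difference between the two options can be mimicked in a few lines of Python. The sketch below is purely illustrative (the queues standing in for the kernel buffer and the network are assumptions of the demo, not real kernel interfaces): the buffered path copies the data twice, the unbuffered path once.

import queue
import threading

network = queue.Queue()        # stands in for the network link (demo assumption)
kernel_buffer = queue.Queue()  # simulated kernel buffer (demo assumption)

def unbuffered_send(user_buffer):
    # Unbuffered: data goes straight from the user buffer onto the network.
    network.put(bytes(user_buffer))

def buffered_send(user_buffer):
    # Buffered: first copy from the user buffer into the kernel buffer;
    # the kernel later copies it onto the network (see kernel_drain).
    kernel_buffer.put(bytes(user_buffer))

def kernel_drain():
    while True:
        network.put(kernel_buffer.get())

threading.Thread(target=kernel_drain, daemon=True).start()

buffered_send(bytearray(b"via kernel buffer"))
unbuffered_send(bytearray(b"direct"))
print(network.get())
print(network.get())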
Synchronous
A Send or a Receive primitive is synchronous if both the Send() and Receive()
handshake with each other.
The processing for the Send primitive completes only after the invoking
processor learns that the other corresponding Receive primitive has also been
invoked and that the receive operation has been completed.
The processing for the Receive primitive completes when the data to be
received is copied into the receiver’s user buffer.
Asynchronous
A Send primitive is said to be asynchronous if control returns to the invoking process after the data item to be sent has been copied out of the user-specified buffer.
It does not make sense to define asynchronous Receive primitives.
Blocking primitives
The primitive commands wait for the message to be delivered. The execution of
the processes is blocked.
The sending process must wait after a send until an acknowledgement is made
by the receiver.
The receiving process must wait for the expected message from the sending
process
The receipt is determined by polling a common buffer or via an interrupt.
This is a form of synchronization or synchronous communication.
A primitive is blocking if control returns to the invoking process after the
processing for the primitive completes.
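A minimal Python sketch of a blocking Send/Receive pair, using an explicit acknowledgement channel (the queue-based channels and the ACK message are assumptions of the demo, not the textbook's primitives):

import queue
import threading

data_chan, ack_chan = queue.Queue(), queue.Queue()

def blocking_send(msg):
    data_chan.put(msg)
    ack_chan.get()          # sender blocks until the receiver acknowledges

def receiver():
    msg = data_chan.get()   # receiver blocks for the expected message
    print("received:", msg)
    ack_chan.put("ACK")

threading.Thread(target=receiver).start()
blocking_send("hello")      # control returns only after the ACK arrives
print("send complete")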
Figure 1.7: A nonblocking send primitive. When the Wait call returns, at least one of
its parameters is posted.
There are four versions of the Send primitive:
Blocking synchronous
Non-blocking synchronous
Blocking asynchronous
Non-blocking asynchronous
Fig a) Blocking synchronous send and blocking receive. Fig b) Non-blocking synchronous send and blocking receive.
The asynchronous Send completes when the data has been copied out of the user's buffer. Checking for completion may be necessary if the user wants to reuse the buffer from which the data was sent.
Non-blocking Receive:
The Receive call will cause the kernel to register the call and return the handle
of a location that the user process can later check for the completion of the
non-blocking Receive operation.
This location gets posted by the kernel after the expected data arrives and is
copied to the user-specified buffer. The user process can check for the
completion of the non-blocking Receive by invoking the Wait operation on the
returned handle.
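A minimal Python sketch of a non-blocking Receive, where a Future stands in for the kernel-returned handle and Future.result() plays the role of the Wait operation (this mapping is an illustration of the idea, not the textbook's API):

import queue
from concurrent.futures import ThreadPoolExecutor

channel = queue.Queue()               # stands in for the kernel message queue
executor = ThreadPoolExecutor(max_workers=1)

def nonblocking_receive():
    # Returns immediately with a handle; the caller checks completion later.
    return executor.submit(channel.get)

handle = nonblocking_receive()        # control returns at once
channel.put(b"data")                  # the expected data arrives later
print(handle.result())                # Wait(): completes the Receive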
Processor Synchrony
Processor synchrony indicates that all the processors execute in lock-step with their
clocks synchronized.
Since distributed systems do not follow a common clock, this abstraction is implemented
using some form of barrier synchronization to ensure that no processor begins executing the
next step of code until all the processors have completed executing the previous steps of
code assigned to each of the processors.
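A minimal sketch of barrier synchronization using Python's threading.Barrier; the three threads standing in for processors of different speeds are an assumption of the demo:

import random
import threading
import time

barrier = threading.Barrier(3)        # three simulated processors

def processor(pid):
    time.sleep(random.random())       # processors run at different speeds
    print(f"P{pid} finished step 1")
    barrier.wait()                    # no processor starts step 2 until
    print(f"P{pid} starts step 2")    # all have completed step 1

threads = [threading.Thread(target=processor, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()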
RMI vs RPC:
RMI uses an object-oriented paradigm where the user needs to know the object and the method of the object to invoke. RPC is not object oriented and does not deal with objects; rather, it calls specific subroutines that are already established.
RMI handles the complexities of passing along the invocation from the local to the remote computer, but instead of passing a procedural call, RMI passes a reference to the object and the method that is being called. With RPC, the call looks like a local call; RPC handles the complexities involved with passing the call from the local to the remote computer.
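For illustration, a minimal RPC round trip using Python's standard xmlrpc module (the add function and port 8000 are arbitrary choices for the demo). Note how the client-side call looks exactly like a local call:

import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):                        # a plain subroutine, in RPC style
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add)
threading.Thread(target=server.serve_forever, daemon=True).start()

proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))                # looks like a local call; prints 5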
Asynchronous Execution:
Communication among processes is considered asynchronous when every communicating process can have a different observation of the order of the messages being exchanged. In an asynchronous execution:
there is no processor synchrony and there is no bound on the drift rate of
processor clocks
message delays are finite but unbounded
no upper bound on the time taken by a process
If system A can be emulated by system B, denoted A/B, and if a problem is not solvable in B, then it is also not solvable in A. If a problem is solvable in A, it is also solvable in B. Hence, in a sense, the four classes of systems (synchronous and asynchronous message passing, and synchronous and asynchronous shared memory) are equivalent in terms of computability in failure-free systems.
The design of distributed systems has numerous challenges. They can be categorized
into:
Issues related to system and operating systems design
Issues related to algorithm design
Issues arising due to emerging technologies
The above three classes are not mutually exclusive.
Issues related to system and operating systems design
Several models provide different degrees of infrastructure for reasoning about distributed executions: the interleaving model, the partial order model, the input/output automata model, and the Temporal Logic of Actions (TLA).
Dynamic distributed graph algorithms and distributed routing algorithms
The distributed system is generally modeled as a distributed graph.
Hence graph algorithms are the basis for a large number of higher-level communication, data dissemination, object location, and object search functions.
These algorithms must be able to deal with highly dynamic graph characteristics, such as the varying link loads that routing algorithms face.
The performance of these algorithms has direct impact on user-perceived latency, data
traffic and load in the network.
Time and global state in a distributed system
The geographically remote resources demand synchronization based on logical time.
Logical time is relative and eliminates the overhead of providing physical time for applications. Logical time can
(i) capture the logic and inter-process dependencies
(ii) track the relative progress at each process
(a minimal logical-clock sketch is given at the end of this subsection).
Maintaining the global state of the system across space involves the role of time
dimension for consistency. This can be done with extra effort in a coordinated manner.
Deriving appropriate measures of concurrency also involves the time dimension, as
the execution and communication speed of threads may vary a lot.
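As promised above, here is a minimal sketch of logical time, assuming a Lamport-style scalar clock (one common realization of logical time; the class and method names are the demo's own):

class LamportClock:
    """Scalar logical clock: tracks relative progress and captures
    inter-process dependencies without any physical time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Internal or send event: advance the local clock.
        self.time += 1
        return self.time               # timestamp piggybacked on a message

    def receive(self, msg_time):
        # The receiver advances past the sender's clock, so a send
        # always gets a smaller timestamp than the matching receive.
        self.time = max(self.time, msg_time) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
t = p1.tick()                          # p1 sends m with timestamp 1
print(p2.receive(t))                   # p2's clock jumps to 2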
Synchronization/coordination mechanisms
Synchronization is essential for the distributed processes to facilitate concurrent
execution without affecting other processes.
The synchronization mechanisms also involve resource management and
concurrency management mechanisms.
Some techniques for providing synchronization are:
Physical clock synchronization: Physical clocks usually diverge in their values due to hardware limitations. Keeping them synchronized is a fundamental challenge to maintain common time.
Leader election: All the processes need to agree on which process will play the role of a distinguished process, or a leader. A leader is necessary even for many distributed algorithms because there is often some asymmetry.
Mutual exclusion: Access to the critical resource(s) has to be coordinated.
Deadlock detection and resolution: Deadlock detection should be coordinated to avoid duplicate work, and deadlock resolution should be coordinated to avoid unnecessary aborts of processes.
Termination detection: This requires cooperation among the processes to detect the specific global state of quiescence.
Garbage collection: Detecting garbage requires coordination among the processes.
Group communication, multicast, and ordered message delivery
A group is a collection of processes that share a common context and collaborate on a
common task within an application domain. Group management protocols are needed for
group communication wherein processes can join and leave groups dynamically, or fail.
The concurrent execution of remote processes may sometimes violate the semantics and order of the distributed program. Hence, a formal specification of the semantics of ordered delivery needs to be formulated and then implemented.
Monitoring distributed events and predicates
Predicates defined on program variables that are local to different processes are used
for specifying conditions on the global system state.
On-line algorithms for monitoring such predicates are hence important.
An important paradigm for monitoring distributed events is that of event streaming,
wherein streams of relevant events reported from different processes are examined
collectively to detect predicates.
The specification of such predicates uses physical or logical time relationships.
Distributed program design and verification tools
Methodically designed and verifiably correct programs can greatly reduce the overhead of
software design, debugging, and engineering. Designing these is a big challenge.
Debugging distributed programs
Debugging distributed programs is much harder because of concurrency and replication. Adequate debugging mechanisms and tools need to be designed to meet these challenges.
Data replication, consistency models, and caching
Fast access to data and other resources is important in distributed systems.
Managing replicas and their updates faces concurrency problems.
Placement of the replicas in the systems is also a challenge because resources
usually cannot be freely replicated.
World Wide Web design – caching, searching, scheduling
WWW is a commonly known distributed system.
The issues of object replication, caching, and prefetching of objects arise on the WWW as well.
Object search and navigation on the web are important functions in the operation
of the web.
Distributed shared memory abstraction
The shared memory abstraction is easier to program with, since it does not involve managing the communication tasks.
The communication is done by the middleware via message passing.
The overhead of shared memory is to be dealt with by the middleware technology.
Some of the methodologies that perform the task of communication in shared memory distributed systems are:
Wait-free algorithms: A wait-free algorithm guarantees that a process can complete its execution irrespective of the actions of other processes. Such algorithms control access to shared resources in the shared memory abstraction, but they are expensive.
Mutual exclusion: Concurrent access of processes to a shared resource or data is
executed in mutually exclusive manner. Only one process is allowed to execute the critical
section at any given time. In a distributed system, shared variables or a local kernel cannot
be used to implement mutual exclusion. Message passing is the sole means for implementing
distributed mutual exclusion.
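As one illustration of message passing being the sole means for distributed mutual exclusion, below is a minimal token-ring sketch (the queues modelling the channels and the single token message are assumptions of the demo; only the token holder enters the critical section):

import queue
import threading

N = 3
channels = [queue.Queue() for _ in range(N)]   # message passing only

def process(i):
    for _ in range(2):
        token = channels[i].get()              # wait for the token message
        print(f"P{i} in critical section")     # only the token holder enters
        channels[(i + 1) % N].put(token)       # pass the token to the successor

threads = [threading.Thread(target=process, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
channels[0].put("TOKEN")                       # inject the single token
for t in threads:
    t.join()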
Register constructions: Architectures must be designed in such a way that registers allow concurrent access without any restrictions on the concurrency permitted.
Reliable and fault-tolerant distributed systems
Real-time scheduling becomes more challenging when a global view of the system state is absent and on-line or dynamic changes are more frequent. The message propagation delays, which are network-dependent, are hard to control or predict. This is a hindrance to meeting the QoS requirements of the network.
Performance
User-perceived latency in distributed systems must be reduced. The common issues in performance are:
Metrics: Appropriate metrics must be defined for measuring the performance of theoretical distributed algorithms and their implementations.
Measurement methods/tools: As the distributed system is a complex entity, appropriate methodology and tools must be developed for measuring the performance metrics.
Sensor networks
o A sensor is a processor with an electro-mechanical interface that is capable of sensing physical parameters.
o They are low-cost devices with limited computational power and battery life. They are designed to handle streaming data and route it to external computer networks and processes.
o They are susceptible to faults and have to reconfigure themselves.
DISTRIBUTED PROGRAM
A distributed program is composed of a set of asynchronous processes p1, p2, ..., pn that communicate by message passing. Let e_i^x denote the x-th event at process p_i. For a message m, let send(m) and rec(m) denote its send and receive events, respectively.
The occurrence of events changes the states of respective processes and channels, thus
causing transitions in the global system state.
An internal event changes the state of the process at which it occurs.
A send event changes the state of the process that sends the message and the state of
the channel on which the message is sent.
The execution of process p_i produces a sequence of events e_i^1, e_i^2, e_i^3, ..., and it is denoted by H_i:
H_i = (h_i, →_i)
where h_i is the set of events produced by p_i and the binary relation →_i defines a linear order (the causal dependencies) among these events of p_i.
→_msg indicates the dependency that exists due to message passing between two events.
When all the above conditions are satisfied for events a and b, then a → b, i.e., a and b are causally related. Consider two events c and d; if both c → d and d → c are false, then c and d are not causally related, and they are said to be concurrent events, denoted as c || d.
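The causality and concurrency relations can be checked mechanically. A small Python sketch, assuming events are named as (process, index) pairs and the edge set is given by program order plus the send/receive pairs (the particular events and edges are made up for the demo):

# Events are named (process, index); causality is the transitive
# closure of program order plus send -> receive edges.
edges = {
    (("p1", 1), ("p1", 2)),   # program order at p1
    (("p2", 1), ("p2", 2)),   # program order at p2
    (("p1", 1), ("p2", 2)),   # send(m) -> rec(m)
}

def happens_before(a, b):
    # Depth-first search over the edge set: does a path a -> ... -> b exist?
    stack, seen = [a], set()
    while stack:
        x = stack.pop()
        for (u, v) in edges:
            if u == x and v not in seen:
                if v == b:
                    return True
                seen.add(v)
                stack.append(v)
    return False

def concurrent(c, d):
    # c || d holds when neither c -> d nor d -> c.
    return not happens_before(c, d) and not happens_before(d, c)

print(concurrent(("p1", 2), ("p2", 1)))   # True: the events are concurrent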
This property ensures that causally related messages destined to the same destination are delivered in an
order that is consistent with their causality relation. Causally ordered delivery of messages implies FIFO
message delivery. Causal ordering model considerably simplifies the design of distributed algorithms
because it provides a built-in synchronization.
GLOBAL STATE
The global state of a distributed system is a collection of the local states of its
components, namely, the processes and the communication channels.
The state of a process at any time is defined by the contents of processor registers,
stacks, local memory, etc. and depends on the local context of the distributed
application.
The state of a channel is given by the set of messages in transit in the channel.
The occurrence of events changes the states of respective processes and channels.
Example: An internal event changes the state of the process at which it occurs. A send event changes the state of the process that sends the message and the state of the channel on which the message is sent. A receive event changes the state of the process that receives the message and the state of the channel on which the message is received.
Notations:
1. LS_i^x denotes the state of process p_i after the occurrence of event e_i^x and before the event e_i^{x+1}.
2. LS_i^0 denotes the initial state of process p_i.
3. LS_i^x is a result of the execution of all the events executed by process p_i till e_i^x.
4. Let send(m) ≤ LS_i^x denote the fact that ∃y : 1 ≤ y ≤ x :: e_i^y = send(m).
5. Let rec(m) ≰ LS_i^x denote the fact that ∀y : 1 ≤ y ≤ x :: e_i^y ≠ rec(m).
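A small Python sketch of notations 4 and 5, assuming a process's events are kept in a 1-indexed list (the event list and message names are made up for the demo):

# events_pi[:x] is everything recorded in LS_i^x (1-indexed events).
events_pi = [("send", "m"), ("internal", None), ("rec", "m2")]

def send_recorded(m, x):
    # send(m) <= LS_i^x  iff  some e_i^y with y <= x equals send(m)
    return any(e == ("send", m) for e in events_pi[:x])

def rec_not_recorded(m, x):
    # rec(m) not in LS_i^x  iff  no e_i^y with y <= x equals rec(m)
    return all(e != ("rec", m) for e in events_pi[:x])

print(send_recorded("m", 1))       # True: e_i^1 = send(m)
print(rec_not_recorded("m2", 2))   # True: rec(m2) happens only at e_i^3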
Channel state:
The state of a channel depends upon the states of the processes it connects. Let SC_{ij}^{x,y} denote the state of a channel C_{ij}. The state of a channel is defined as follows:
SC_{ij}^{x,y} = { m_{ij} | send(m_{ij}) ≤ LS_i^x ∧ rec(m_{ij}) ≰ LS_j^y }
Thus, channel state SC_{ij}^{x,y} denotes all messages that p_i sent up to event e_i^x and which process p_j had not received until event e_j^y.
The global state of a distributed system is a collection of the local states of the processes and the channels. Notationally, the global state GS is defined as
GS = { ∪_i LS_i^{x_i}, ∪_{j,k} SC_{jk}^{y_j, z_k} }
For a global state to be meaningful, the states of all the components of the distributed system must be recorded at the same instant.
In the figure, a global state GS1 = { LS_1^1, LS_2^3, LS_3^3, LS_4^2 } is inconsistent because the state of p_2 has recorded the receipt of message m_12; however, the state of p_1 has not recorded its send. A global state GS2 consisting of local states { LS_1^2, LS_2^4, LS_3^4, LS_4^2 } is consistent; all the channels are empty except C_21, which contains message m_21.
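The consistency condition can be checked mechanically. A minimal Python sketch, assuming each local state records the ids of messages whose send and receive events it has seen (the values mirror the inconsistent state GS1 above):

# Each local state records the message ids whose send/receive events
# it has seen; p2 has recorded rec(m12), but p1 has not recorded send(m12).
local_states = {
    "p1": {"sent": set(),      "received": set()},
    "p2": {"sent": set(),      "received": {"m12"}},
}

def consistent(states):
    # A global state is consistent only if every recorded receive has a
    # matching recorded send (law of conservation of messages).
    all_sent = set().union(*(s["sent"] for s in states.values()))
    all_recv = set().union(*(s["received"] for s in states.values()))
    return all_recv <= all_sent

print(consistent(local_states))    # False: m12 received but send not recorded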