Distributed System unit-1
B. Tech-VI SEM / DISTRIBUTED SYSTEM ( KCS077)/ UNIT – I / Nida Rahman / Assistant Professor / Department of
CSE / JMS Institute of TechnologyGhaziabad DISTRIBUTED SYSTEM ( KCS077)
UNIT-1
DISTRIBUTED SYSTEMS
Basic Issues
1. What is a Distributed System?
2. Examples of Distributed Systems
3. Advantages and Disadvantages
4. Design Issues with Distributed Systems
5. Preliminary Course Topics
1. Network of workstations
1. Transparency
Issue: How to achieve the single system image? How to "fool" everyone into
thinking that the collection of machines is a "simple" computer?
Access transparency - local and remote resources are accessed using identical
operations.
Location transparency - users cannot tell where hardware and software resources
(CPUs, files, databases) are located; the name of the resource shouldn’t encode
the location of the resource.
Migration (mobility) transparency - resources should be free to move from one
location to another without having their names changed.
Replication transparency - the system is free to make additional copies of files
and other resources (for purpose of performance and/or reliability), without the
users noticing.
Example: several copies of a file exist; for a given request, the copy closest
to the client is accessed.
Concurrency transparency - the users will not notice the existence of other users
in the system (even if they access the same resources).
Failure transparency - applications should be able to complete their task despite
failures occurring in certain components of the system.
Performance transparency - load variation should not lead to performance
degradation.
This could be achieved by automatic reconfiguration in response to changes of
the load, but it is difficult to achieve.
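As a rough illustration of location, replication and migration transparency, the Python sketch below resolves a location-independent name to the closest replica. The class NameService, the host names and the distance values are hypothetical and only serve the example.

```python
class NameService:
    """Hypothetical registry mapping location-independent names to replicas."""

    def __init__(self):
        # name -> list of (host, distance_to_client) pairs; values are made up
        self._replicas = {}

    def register(self, name, host, distance):
        self._replicas.setdefault(name, []).append((host, distance))

    def resolve(self, name):
        # The caller never names a host: the closest replica is picked for it.
        host, _ = min(self._replicas[name], key=lambda r: r[1])
        return host

    def migrate(self, name, old_host, new_host, distance):
        # Migration transparency: the resource moves, its name does not change.
        self._replicas[name] = [r for r in self._replicas[name] if r[0] != old_host]
        self.register(name, new_host, distance)


ns = NameService()
ns.register("/docs/report.txt", "server-a", 12)
ns.register("/docs/report.txt", "server-b", 3)
closest = ns.resolve("/docs/report.txt")  # the client only supplies the name
```

The name "/docs/report.txt" does not encode a location, so replicas can be added or moved without the client noticing.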
2. Communication:
Issue: Components of a distributed system have to communicate in order to
interact. This implies support at two levels:
1. Networking infrastructure (interconnections & network software).
2. Appropriate communication primitives and models and their
implementation:
a. communication primitives:
i. send
ii. receive
iii. remote procedure call (RPC)
b. communication models
i. Client-Server communication: implies a message exchange
between two processes: the process which requests a service
and the one which provides it;
ii. Group Multicast: the target of a message is a set of processes,
which are members of a given group.
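The primitives and models above can be sketched with in-process mailboxes. This is only a toy model under the assumption of reliable, ordered delivery; the names Process and multicast are hypothetical, and no real network is involved.

```python
from collections import deque


class Process:
    """A hypothetical process with a mailbox; send/receive are the primitives."""

    def __init__(self, name):
        self.name = name
        self.mailbox = deque()

    def send(self, target, message):
        # Deliver the message into the target's mailbox, tagged with the sender.
        target.mailbox.append((self.name, message))

    def receive(self):
        # Take the oldest message from the local mailbox.
        return self.mailbox.popleft()


def multicast(sender, group, message):
    # Group multicast: the target of the message is every member of the group.
    for member in group:
        sender.send(member, message)


# Client-server: one request message, one reply message.
client, server = Process("client"), Process("server")
client.send(server, "time?")
requester, request = server.receive()
server.send(client, "reply to " + request)
reply = client.receive()

# Group multicast to a group with two members.
group = [Process("p1"), Process("p2")]
multicast(client, group, "hello")
```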
3. Performance
Several factors influence the performance of a distributed system:
The performance of individual workstations.
The speed of the communication infrastructure.
Extent to which reliability (fault tolerance) is provided (replication and
preservation of coherence imply large overheads).
Flexibility in workload allocation: for example, idle processors
(workstations) could be allocated automatically to a user’s task.
4. Scalability
The system should remain efficient even with a significant increase in the number
of users and resources connected:
Cost of adding resources should be reasonable;
Performance loss with increased number of users and resources should be
controlled;
Software resources should not run out (number of bits allocated to
addresses, number of entries in tables, etc.)
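A small arithmetic sketch of the last point: the number of bits allocated to an address fixes how many resources can ever be named. The function names are hypothetical.

```python
def address_space(bits):
    """Number of distinct identifiers that fit in the given number of bits."""
    return 2 ** bits


def can_address(bits, resources):
    # True if every resource can receive its own identifier.
    return resources <= address_space(bits)


print(address_space(16))          # 65536
print(can_address(16, 100_000))   # False
print(can_address(32, 100_000))   # True
```

A field width that looks generous at design time can become the limit that stops the system from growing.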
5. Heterogeneity
Distributed applications are typically heterogeneous
Different hardware: mainframes, workstations, PCs, servers, etc.;
Different software: UNIX, MS-Windows, IBM OS/2, Real-time OSs, etc.;
Unconventional devices: teller machines, telephone switches, robots,
manufacturing systems, etc.;
Diverse networks and protocols: Ethernet, FDDI, ATM, TCP/IP, Novell
Netware, etc.
An additional software layer called middleware is used to mask heterogeneity.
6. Openness
One of the important features of distributed systems is openness and flexibility:
Every service is equally accessible to every client (local or remote);
It is easy to implement, install and debug new services;
Users can write and install their own services.
Key aspects of openness:
Standard interfaces and protocols (like Internet communication protocols)
Support of heterogeneity (by adequate middleware, like CORBA)
Availability: If machines go down, the system should work with the reduced
amount of resources.
There should be a very small number of critical resources;
Critical resources: resources which have to be up in order for the distributed
system to work.
Data on the system must not be lost, and copies stored redundantly on different
servers must be kept consistent.
The more copies kept, the better the availability, but keeping consistency
becomes more difficult.
8. Security
Security of information resources:
Confidentiality: Protection against disclosure to unauthorized persons
Integrity: Protection against alteration and corruption
Availability: Keep the resource accessible
Security risks are associated with free access, because distributed systems
should allow communication between programs/users/resources on different
computers.
The appropriate use of resources by different users has to be guaranteed.
1. Architectural Models
Issue: How are responsibilities distributed between system components and how
are these components placed?
Client-server model
Peer-to-peer
Variations of the above two:
Proxy server
Mobile code
Mobile agents
Network computers
Thin clients
Mobile devices
Peer-to-Peer Model
2. Interaction Models
Issue: How do we handle time? Are there time limits on process execution,
message delivery, and clock drifts?
Synchronous distributed systems
Asynchronous distributed systems
Main features of synchronous distributed systems:
Lower and upper bounds on execution time of processes can be set.
Transmitted messages are received within a known bounded time.
Drift rates between local clocks have a known bound.
Important consequences:
In a synchronous distributed system there is a notion of global physical
time (with a known relative precision depending on the drift rate).
Only synchronous distributed systems have a predictable behavior in terms
of timing. Only such systems can be used for hard real-time applications.
In a synchronous distributed system it is possible and safe to use timeouts
in order to detect failures of a process or communication link.
It is difficult and costly to implement synchronous distributed systems.
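The use of timeouts in a synchronous system can be sketched as follows. The sketch assumes that an upper bound on message delay and on processing time is known and ignores clock drift; the function names and the time unit are hypothetical.

```python
def safe_timeout(max_delay, max_processing):
    """Upper bound on a request/reply round trip in a synchronous system.

    Assumes known bounds: one message delay each way plus the processing time.
    """
    return 2 * max_delay + max_processing


def suspect_failure(elapsed, max_delay, max_processing):
    # With known bounds, exceeding the timeout can only mean that the
    # process or the communication link has failed.
    return elapsed > safe_timeout(max_delay, max_processing)


print(safe_timeout(10, 5))          # 25
print(suspect_failure(20, 10, 5))   # False
print(suspect_failure(30, 10, 5))   # True
```

In an asynchronous system no such bounds exist, so the same check could only raise a suspicion, not detect a failure safely.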
Important consequences:
3. Fault Models
Issue: What kind of faults can occur and what are their effects?
Omission faults
Arbitrary faults
Timing faults
Omission Faults
A processor or communication channel fails to perform actions it is supposed to
do.
This means that the particular action is not performed!
We do not have an omission fault if:
o An action is delayed (regardless of how long) but finally executed.
o An action is executed with an erroneous result.
With synchronous systems, omission faults can be detected by timeouts.
If we are sure that messages arrive, a timeout will indicate that the sending
process has crashed. Such a system has ‘fail-stop’ behavior.
Timing Faults
Timing faults can occur in synchronous distributed systems, where time limits are
set on process execution, communications, and clock drifts.
A timing fault occurs if any of these time limits is exceeded.
Network Protocol
Middleware and distributed applications have to be implemented on top of a
network protocol. Such a protocol is implemented as several layers.
In case of the Internet:
guarantees.
o Sequencing: A sequence number is attached to each transmitted
segment (packet). At the receiver side, no segment is delivered until
all lower numbered segments have been delivered.
o Flow control: The sender takes care not to overwhelm the receiver
(or intermediate nodes). This is based on periodic acknowledgements
received by the sender from the receiver.
o Retransmission and duplicate handling: If a segment is not
acknowledged within a specified timeout, the sender retransmits it.
Based on the sequence number, the receiver is able to detect and
reject duplicates.
o Buffering: Buffering is used to balance the flow between sender and
receiver. If the receiving buffer is full, incoming segments are
dropped. They will not be acknowledged and the sender will
retransmit them.
o Checksum: Each segment carries a checksum. If the received
segment doesn’t match the checksum, it is dropped (and will be
retransmitted)
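The receiver side of these mechanisms can be sketched as below. This is a simplified model, not the TCP implementation: segments are numbered one by one (real TCP numbers bytes), CRC32 from the standard zlib module stands in for the TCP checksum, and the names make_segment and Receiver are hypothetical.

```python
import zlib


def make_segment(seq, payload):
    # CRC32 is only a stand-in for the real TCP checksum.
    return {"seq": seq, "payload": payload, "checksum": zlib.crc32(payload)}


class Receiver:
    def __init__(self, buffer_size=4):
        self.expected = 0        # next sequence number to deliver
        self.buffer = {}         # out-of-order segments, keyed by seq
        self.buffer_size = buffer_size
        self.delivered = []

    def on_segment(self, seg):
        """Return the next expected seq as acknowledgement, or None if dropped."""
        if zlib.crc32(seg["payload"]) != seg["checksum"]:
            return None                      # corrupted: drop, sender retransmits
        if seg["seq"] < self.expected or seg["seq"] in self.buffer:
            return self.expected             # duplicate: reject, acknowledge again
        if seg["seq"] != self.expected and len(self.buffer) >= self.buffer_size:
            return None                      # buffer full: drop without acknowledging
        self.buffer[seg["seq"]] = seg["payload"]
        # Deliver only when all lower-numbered segments have been delivered.
        while self.expected in self.buffer:
            self.delivered.append(self.buffer.pop(self.expected))
            self.expected += 1
        return self.expected


rx = Receiver()
ack1 = rx.on_segment(make_segment(1, b"world"))    # out of order: buffered
ack2 = rx.on_segment(make_segment(0, b"hello "))   # fills the gap, both delivered
ack3 = rx.on_segment(make_segment(1, b"world"))    # duplicate: rejected
```

A segment for which the sender receives no acknowledgement within its timeout would simply be sent again, which is why dropping is safe here.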
Implementation of RMI
The proxy is the local representative of the remote object ⇒ for each remote
method in B there exists a corresponding method in the proxy.
Implementation is as follows
Communication module
The communication modules on the client and server are responsible for
carrying out the exchange of messages which implement the
request/reply protocol needed to execute the remote invocation.
The particular messages exchanged, and the way errors are handled,
depend on the RMI semantics which is implemented (see slide 40).
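The roles of the proxy and the communication module can be sketched in a few lines. This is not Java RMI: the classes Account, Dispatcher and AccountProxy are hypothetical, JSON stands in for marshalling, and a direct function call replaces the network.

```python
import json


class Account:
    """The remote object B (hypothetical example class)."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


class Dispatcher:
    """Server side: unmarshals the request and invokes the method on B."""

    def __init__(self, target):
        self.target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes)
        result = getattr(self.target, request["method"])(*request["args"])
        return json.dumps({"result": result}).encode()


class AccountProxy:
    """Client side: one proxy method per remote method of B."""

    def __init__(self, channel):
        self.channel = channel  # stands in for the communication module

    def deposit(self, amount):
        # Marshal the arguments, send the request, unmarshal the reply.
        request = json.dumps({"method": "deposit", "args": [amount]}).encode()
        reply = self.channel(request)
        return json.loads(reply)["result"]


server = Dispatcher(Account())
proxy = AccountProxy(server.handle)   # a direct call replaces the network here
first = proxy.deposit(50)
second = proxy.deposit(25)
```

To the caller, proxy.deposit(50) looks like a local invocation; the request/reply exchange is hidden inside the proxy.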
Problems:
Time triggered systems: these are systems in which certain activities are
scheduled to occur at predefined moments in time. If such activities are to
be coordinated over a distributed system we need a coherent notion of
time.
Example: time-triggered real-time systems
Maintaining the consistency of distributed data is often based on the time
when a certain modification has been performed.
Example: a make program.
When the programmer has finished changing some source files he starts make;
make examines the times at which all object and source files were last modified
and decides which source files have to be recompiled.
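The make example can be reduced to one comparison. The numbers below are invented: they assume the object file was written on a machine with a correct clock and the source file was edited later on a machine whose clock runs 10 time units behind.

```python
def needs_rebuild(source_mtime, object_mtime):
    """The rule make applies: recompile when the source is newer than the object."""
    return source_mtime > object_mtime


# Object file compiled at true time 100 on a machine whose clock is correct.
object_mtime = 100
# Source edited later, at true time 105, on a machine whose clock runs 10 behind.
skew = -10
source_mtime = 105 + skew   # recorded as 95

print(needs_rebuild(source_mtime, object_mtime))  # False: the edit is missed
print(needs_rebuild(105, object_mtime))           # True with synchronized clocks
```

The edit happened after the compilation, yet the recorded timestamps say the opposite, so the stale object file is kept.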
Solutions:
1. Synchronization of physical clocks
2. Logical clocks
possible that b → c and at the same time Tb > Tc (Tb is the physical time of b).
Logical clocks can be used in order to capture the happened-before relation.
A logical clock is a monotonically increasing software counter.
There is a logical clock CPi at each process Pi in the system.
The value of the logical clock is used to assign timestamps to events. CPi(a)
is the timestamp of event a in process Pi.
There is no relationship between a logical clock and any physical clock.
[R1]:
CPi is incremented before each event is issued at process Pi: CPi := CPi + 1.
[R2]:
a) When ‘a’ is the event of sending a message ‘m’ from process Pi, then the
timestamp tm = CPi (a) is included in ‘m’. (CPi(a) is the logical clock value
obtained after applying rule R1).
b) On receiving message ‘m’ by process Pj, its logical clock CPj is updated as
follows: CPj := max(CPj, tm).
c) The new value of CPj is used to timestamp the event of receiving message
‘m’ by Pj (applying rule R1).
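Rules [R1] and [R2] translate almost directly into code. The following is a minimal sketch; the class and method names are hypothetical, and message transport is reduced to passing the timestamp by hand.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # [R1]: increment the clock before each event.
        self.time += 1
        return self.time

    def send(self):
        # [R2a]: the timestamp of the send event travels with the message.
        return self.tick()

    def receive(self, tm):
        # [R2b]: take the maximum of the local clock and the message timestamp,
        # [R2c]: then apply [R1] to timestamp the receive event.
        self.time = max(self.time, tm)
        return self.tick()


p1, p2 = LamportClock(), LamportClock()
p1.tick()              # local event at P1, timestamp 1
tm = p1.send()         # send event at P1, timestamp 2
p2.tick()              # local event at P2, timestamp 1
rec = p2.receive(tm)   # receive event at P2, timestamp max(1, 2) + 1 = 3
```

The receive event gets a larger timestamp than the send event, as the happened-before relation requires.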
If ‘a’ and ‘b’ are events in the same process and ‘a’ occurred before ‘b’,
then a → b, and (by R1) C(a) < C(b).
If ‘a’ is the event of sending a message ‘m’ in a process, and ‘b’ is the event
of the same message ‘m’ being received by another process, then a → b, and
(by R2) C(a) < C(b).
If a → b and b → c, then a → c, and (by induction) C(a) < C(c).
M1 has to be processed before M2 and M3. However P3 has not to wait for
Vector Clocks
Vector clocks give the ability to decide whether two events are causally related or
not by simply looking at their timestamp.
Each process Pi has a clock CPi, which is an integer vector of length n (n is
the number of processes).
The value of CPi is used to assign timestamps to events in process Pi.
CvPi(a) is the timestamp of event a in process Pi.
CPi[i], the ith entry of CPi, corresponds to Pi’s own logical time.
CPi[j], j ≠ i, is Pi’s "best guess" of the logical time at Pj.
CPi[j] indicates the (logical) time of occurrence of the last event at Pj which
is in a happened before relation to the current event at Pi.
c) The new value of CPj is used to timestamp the event of receiving message
‘m’ by Pj (applying rule R1).
Causal Order
Two events a and b are causally related if and only if
o Cv(a) < Cv(b) or Cv(b) < Cv(a). Otherwise the events are concurrent.
With vector clocks we get the property which we missed for Lamport’s
logical clocks:
o a → b if and only if Cv(a) < Cv(b). Thus, by just looking at the
timestamps of the events, we can say whether two events are
causally related or not.
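A minimal vector clock sketch follows. The update rules are assumed to be the usual ones (increment the own entry before each event, take the entry-wise maximum on receipt, then timestamp the receive event); Cv(a) < Cv(b) is taken to mean that every entry of Cv(a) is less than or equal to the matching entry of Cv(b) and the vectors differ. All names are hypothetical.

```python
class VectorClock:
    def __init__(self, n, i):
        self.v = [0] * n   # one entry per process
        self.i = i         # index of the owning process

    def tick(self):
        # Increment the own entry before each event.
        self.v[self.i] += 1
        return tuple(self.v)

    def send(self):
        # The timestamp of the send event is attached to the message.
        return self.tick()

    def receive(self, tm):
        # Entry-wise maximum, then timestamp the receive event.
        self.v = [max(a, b) for a, b in zip(self.v, tm)]
        return self.tick()


def less(u, v):
    """u < v: every entry of u is <= the matching entry of v, and u != v."""
    return all(a <= b for a, b in zip(u, v)) and u != v


def concurrent(u, v):
    return u != v and not less(u, v) and not less(v, u)


p0, p1, p2 = (VectorClock(3, i) for i in range(3))
a = p0.send()        # (1, 0, 0)
b = p1.receive(a)    # (1, 1, 0)
c = p2.tick()        # (0, 0, 1)

print(less(a, b))        # True: a happened before b
print(concurrent(b, c))  # True: neither timestamp is less than the other
```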
Basic Idea:
A message is delivered to a process only if the message immediately preceding it
(considering the causal ordering) has been already delivered to the process.
Otherwise, the message is buffered.
We assume that processes communicate using broadcast messages. (There exist
similar protocols for non-broadcast communication too.)
The events which are of interest here are the sending of messages ⇒ vector
clocks are incremented only when a message is sent.
Delayed messages are queued at each process in a queue that is sorted by their
vector timestamp; concurrent messages are ordered by the time of their arrival.
[R3]:
When a message is delivered at process Pj, its vector clock CPj is updated
according to rule [R1:b] for vector clock implementation.
tm[i] - 1 indicates how many messages originating from Pi precede m.
Step [R2.1] ensures that process Pj has received all the messages
originating from Pi that precede m.
Step [R2.2] ensures that Pj has received all those messages received by Pi
before sending m.
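The delivery test can be sketched as below. The conditions are the ones usually given for this kind of causal broadcast and are inferred from the explanations of steps [R2.1] and [R2.2]: a message m from Pi is delivered at Pj when CPj[i] = tm[i] - 1 and CPj[k] ≥ tm[k] for every k ≠ i. For brevity the buffer is a plain list instead of a sorted queue, a sender does not receive its own broadcast, and all names are hypothetical.

```python
class Process:
    def __init__(self, n, i):
        self.i = i
        self.clock = [0] * n
        self.pending = []      # buffered (sender, tm, payload) messages
        self.delivered = []

    def broadcast(self, payload):
        # The clock is incremented only when a message is sent.
        self.clock[self.i] += 1
        return (self.i, tuple(self.clock), payload)

    def _deliverable(self, sender, tm):
        # [R2.1]: all earlier messages from the sender have been delivered.
        if self.clock[sender] != tm[sender] - 1:
            return False
        # [R2.2]: everything the sender had delivered before sending is here too.
        return all(self.clock[k] >= tm[k] for k in range(len(tm)) if k != sender)

    def receive(self, message):
        # Buffer the message, then deliver whatever has become deliverable.
        self.pending.append(message)
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                sender, tm, payload = msg
                if self._deliverable(sender, tm):
                    self.pending.remove(msg)
                    self.delivered.append(payload)
                    # [R3]: update the vector clock on delivery.
                    self.clock = [max(a, b) for a, b in zip(self.clock, tm)]
                    progress = True


p0, p1, p2 = (Process(3, i) for i in range(3))
m1 = p0.broadcast("m1")   # timestamp (1, 0, 0)
p1.receive(m1)
m2 = p1.broadcast("m2")   # timestamp (1, 1, 0), causally after m1
p2.receive(m2)            # arrives first at P2: buffered
buffered = list(p2.pending)
p2.receive(m1)            # m1 is delivered, which releases m2
```

P2 sees m2 before m1 on the network, but delivers them in causal order.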
Sending a Message:
Send message M, time stamped tm, along with V_P1 to P2.
Insert (P2, tm) into V_P1. Overwrite the previous value of (P2,t), if any.
(P2,tm) is not sent. Any future message carrying (P2,tm) in V_P1 cannot be
delivered to P2 until tm < tP2.
Delivering a message
If V_M (in the message) does not contain any pair (P2, t), it can be
delivered.
If V_M contains a pair (P2, t): if t ≥ tP2, buffer the message (don’t
deliver it); otherwise (t < tP2) deliver it.
Example:
Global States
Problem: How to collect and record a consistent global state in a distributed
system.
Why a problem?
Because there is no global clock (no coherent notion of time) and no shared
memory!
Consider a bank system with two accounts A and B at two different sites; we
transfer $50 between A and B.
In general, a global state consists of a set of local states and a set of states
of the communication channels.
The state of the communication channel in a consistent global state should
be the sequence of messages sent along the channel before the sender’s
state was recorded, excluding the sequence of messages received along the
channel before the receiver’s state was recorded.
Global states are very often recorded without using channel states.
Formal Definition
LSi is the local state of process Pi. Beside other information, the local state
also includes a record of all messages sent and received by the process.
We consider the global state GS of a system, as the collection of the local
states of its processes: GS = {LS1, LS2, ..., LSn}.
A certain global state can be consistent or not!
send(Mij) denotes the event of sending message Mij from Pi to Pj;
send(Mij) ∈ LSi if and only if the sending event occurred before the
local state was recorded;
rec(Mij) denotes the event of receiving message Mij by Pj.
rec(Mij) ∈ LSj if and only if the receiving event occurred before the
local state was recorded;
The global state is seen as a collection of the local states, without explicitly
capturing the state of the channel.
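With local states represented as records of sent and received messages, the classification of a global state can be sketched as follows. The definitions used are the usual ones and are an assumption here: a global state is inconsistent if some message is recorded as received but not as sent, consistent if no such message exists, and strongly consistent if in addition no message is still in transit. Message identifiers are assumed to be globally unique, and the function name is hypothetical.

```python
def classify(local_states):
    """local_states: list of dicts with 'sent' and 'received' sets of message ids."""
    sent = set().union(*(ls["sent"] for ls in local_states))
    received = set().union(*(ls["received"] for ls in local_states))
    if received - sent:
        return "inconsistent"          # a message was received but never sent
    if sent - received:
        return "consistent"            # some messages are still in transit
    return "strongly consistent"


# Bank example: site A sends transfer t1 ($50) to site B.
a_before = {"sent": set(), "received": set()}
a_after = {"sent": {"t1"}, "received": set()}
b_before = {"sent": set(), "received": set()}
b_after = {"sent": set(), "received": {"t1"}}

print(classify([a_before, b_after]))   # inconsistent: the $50 is counted twice
print(classify([a_after, b_before]))   # consistent: the $50 is in transit
print(classify([a_after, b_after]))    # strongly consistent
```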
Example
{LS11, LS22, LS32} is inconsistent;
{LS12, LS23, LS33} is consistent;
{LS11, LS21, LS31} is strongly consistent.
Chandy-Lamport Algorithm
A process Pi records its local state LSi and later sends a message ‘m’ to Pj;
LSj at Pj has to be recorded before Pj has received m.
The state SChij of the channel Chij consists of all messages that process Pi
sent before recording LSi and which have not been received by Pj when
recording LSj.
A snapshot is started at the request of a particular process Pi, for example,
when it suspects a deadlock because of a long delay in accessing a resource;
Pi then records its state LSi and, before sending any other message, it sends
a token to every Pj that Pi communicates with.
When Pj receives a token from Pi, and this is the first time it received a
token, it must record its state before it receives the next message from Pi.
After recording its state Pj sends a token to every process it communicates
with, before sending them any other message.
Algorithm
Rule for sender Pi:
/* performed by the initiating process and by any other process at the reception
of the first token */
[SR1]:
Pi records its state.
[SR2]:
Pi sends a token on each of its outgoing channels.
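A small simulation of the snapshot for two bank sites is sketched below. The sender rule is the one above; the rule applied on receiving a token (record the state on the first token and record that channel as empty, otherwise stop recording that channel) is the usual one and is an assumption here, as are the FIFO channels modelled with deque and all names.

```python
from collections import deque

TOKEN = "TOKEN"


class Node:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.out = {}             # peer name -> FIFO channel to that peer
        self.inc = {}             # peer name -> FIFO channel from that peer
        self.recorded = None      # recorded local state
        self.channel_state = {}   # peer name -> messages recorded for that channel
        self.recording = set()    # incoming channels still being recorded

    def start_snapshot(self):
        self.recorded = self.balance                  # [SR1]
        self.recording = set(self.inc)
        self.channel_state = {p: [] for p in self.inc}
        for channel in self.out.values():             # [SR2]
            channel.append(TOKEN)

    def send(self, peer, amount):
        self.balance -= amount
        self.out[peer].append(amount)

    def receive(self, peer):
        msg = self.inc[peer].popleft()
        if msg == TOKEN:
            if self.recorded is None:
                self.start_snapshot()     # first token: act as a sender
            self.recording.discard(peer)  # this channel's state is now complete
        else:
            self.balance += msg
            if peer in self.recording:
                self.channel_state[peer].append(msg)  # message was in transit


def connect(a, b):
    ab, ba = deque(), deque()
    a.out[b.name], b.inc[a.name] = ab, ab
    a.inc[b.name], b.out[a.name] = ba, ba


a, b = Node("A", 100), Node("B", 50)
connect(a, b)
b.send("A", 20)       # $20 is in transit from B to A
a.start_snapshot()    # A records 100 and sends a token to B
a.send("B", 50)       # sent after the token, so not part of the snapshot
b.receive("A")        # token: B records 30 and sends a token to A
b.receive("A")        # the $50 transfer
a.receive("B")        # the $20 arrives and is recorded as channel state
a.receive("B")        # token: recording of channel B -> A ends
total = a.recorded + b.recorded + sum(a.channel_state["B"]) + sum(b.channel_state["A"])
```

The recorded states (100 and 30) plus the recorded channel state ($20 in transit) add up to the 150 that the system holds, so the snapshot is consistent.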