
B. Tech - VI Sem / DISTRIBUTED SYSTEM (KCS077) / Unit I
Nida Rahman, Assistant Professor, Department of CSE, JMS Institute of Technology, Ghaziabad

UNIT-1
DISTRIBUTED SYSTEMS
Basic Issues
1. What is a Distributed System?
2. Examples of Distributed Systems
3. Advantages and Disadvantages
4. Design Issues with Distributed Systems
5. Preliminary Course Topics

What is a Distributed System?


A distributed system is a collection of autonomous computers, linked by a
computer network, that appears to its users as a single computer.
Some comments:
 System architecture: the machines are autonomous; this means they are
computers which, in principle, could work independently.
 The user’s perception: the distributed system is perceived as a single
system solving a certain problem (even though, in reality, it consists of several
computers placed in different locations).

By running distributed system software the computers are enabled to:


1. Coordinate their activities
2. Share resources: hardware, software, data.

Examples of Distributed Systems

1. Network of workstations

 Personal workstations + processors not assigned to specific users.
 A single file system, with all files accessible from all machines in the
same way, using the same path name.
 For a certain command the system can look for the best place
(workstation) to execute it.

2. Automatic banking (teller machine) system

 Primary requirements: security and reliability.
 Consistency of replicated data.
 Concurrent transactions (operations which involve accounts in different
banks; simultaneous access from several users, etc.).
 Fault tolerance.

Some more examples of Distributed Real-Time Systems

 Synchronization of physical clocks


 Scheduling with hard time constraints
 Real-time communication
 Fault tolerance

Advantages of Distributed System


Performance: very often a collection of processors can provide higher
performance (and better price/performance ratio) than a centralized computer.
Distribution: many applications involve, by their nature, spatially separated
machines (banking, commercial, automotive system).
Reliability (fault tolerance): if some of the machines crash, the system can
survive.
Incremental growth: as requirements on processing power grow, new machines
can be added incrementally.
Sharing of data/resources: shared data is essential to many applications (banking,
computer-supported cooperative work, reservation systems); other resources can
also be shared (e.g. expensive printers).

Disadvantage of Distributed System

Difficulties of developing distributed software: what should operating systems,
programming languages and applications look like?

Networking problems: several problems are created by the network
infrastructure and have to be dealt with: loss of messages, overloading, etc.
Security problems: sharing generates the problem of data security.

Design Issues with Distributed Systems


Design issues that arise specifically from the distributed nature of the application:
 Transparency
 Communication
 Performance
 Scalability
 Heterogeneity
 Openness
 Reliability & fault tolerance
 Security

1. Transparency

Issue: How to achieve the single system image? How to "fool" everyone into
thinking that the collection of machines is a "simple" computer?

Access transparency - local and remote resources are accessed using identical
operations.
Location transparency - users cannot tell where hardware and software resources
(CPUs, files, databases) are located; the name of a resource shouldn’t encode
its location.
Migration (mobility) transparency - resources should be free to move from one
location to another without having their names changed.
Replication transparency - the system is free to make additional copies of files
and other resources (for purpose of performance and/or reliability), without the
users noticing.
Example: several copies of a file; at a certain request that copy is accessed which
is the closest to the client.
Concurrency transparency - the users will not notice the existence of other users
in the system (even if they access the same resources).
Failure transparency - applications should be able to complete their task despite
failures occurring in certain components of the system.
Performance transparency - load variation should not lead to performance
degradation.
This could be achieved by automatic reconfiguration as response to changes of
the load; it is difficult to achieve.

2. Communication:
Issue: Components of a distributed system have to communicate in order to
interact. This implies support at two levels:
1. Networking infrastructure (interconnections & network software).
2. Appropriate communication primitives and models and their
implementation:
a. communication primitives:
i. send
ii. receive
iii. remote procedure call (RPC)
b. communication models
i. Client-Server communication: implies a message exchange
between two processes: the process which requests a service
and the one which provides it;
ii. Group Multicast: the target of a message is a set of processes,
which are members of a given group.

3. Performance
Several factors are influencing the performance of a distributed system:
 The performance of individual workstations.
 The speed of the communication infrastructure.
 Extent to which reliability (fault tolerance) is provided (replication and
preservation of coherence imply large overheads).
 Flexibility in workload allocation: for example, idle processors
(workstations) could be allocated automatically to a user’s task.

4. Scalability
The system should remain efficient even with a significant increase in the number
of users and resources connected:
 Cost of adding resources should be reasonable;
 Performance loss with increased number of users and resources should be
controlled;
 Software resources should not run out (number of bits allocated to
addresses, number of entries in tables, etc.)

5. Heterogeneity
Distributed applications are typically heterogeneous
 Different hardware: mainframes, workstations, PCs, servers, etc.;
 Different software: UNIX, MS-Windows, IBM OS/2, Real-time OSs, etc.;
 Unconventional devices: teller machines, telephone switches, robots,
manufacturing systems, etc.;
 Diverse networks and protocols: Ethernet, FDDI, ATM, TCP/IP, Novell
Netware, etc.
An additional software layer called middleware used to mask heterogeneity.

6. Openness
One of the important features of distributed systems is openness and flexibility:
 Every service is equally accessible to every client (local or remote);
 It is easy to implement, install and debug new services;
 Users can write and install their own services.
Key aspects of openness:
 Standard interfaces and protocols (like Internet communication protocols);
 Support of heterogeneity (by adequate middleware, like CORBA).

7. Reliability and Fault Tolerance


One of the main goals of building distributed systems is improvement of
reliability.

Availability: if machines go down, the system should work with the reduced
amount of resources.
 There should be a very small number of critical resources;
 Critical resources: resources which have to be up in order for the
distributed system to work;
 Key pieces of hardware and software (critical resources) should be
replicated, i.e. if one of them fails another one takes over (redundancy).

Data on the system must not be lost, and copies stored redundantly on different
servers must be kept consistent.
 The more copies kept, the better the availability, but keeping consistency
becomes more difficult.

Fault tolerance is a main issue related to reliability: the system has to detect
faults and act in a reasonable way:
 Mask the fault: continue to work with possibly reduced performance but
without loss of data/ information.
 Fail gracefully: react to the fault in a predictable way and possibly stop
functionality for a short period, but without loss of data/information.

8. Security
Security of information resources:
 Confidentiality: Protection against disclosure to unauthorized persons
 Integrity: Protection against alteration and corruption
 Availability: Keep the resource accessible

Free access creates security risks, because distributed systems must allow
communication between programs, users, and resources on different computers.
The appropriate use of resources by different users has to be guaranteed.

MODELS OF DISTRIBUTED SYSTEMS


Basic Element:
 Resources in a distributed system are shared between users. They are
normally encapsulated within one of the computers and can be accessed
from other computers by communication.
 Each resource is managed by a program, the resource manager; it offers a
communication interface enabling the resource to be accessed by its users.
 Resource managers can be in general modeled as processes.
If the system is designed according to an object oriented methodology, resources
are encapsulated in objects.

1. Architectural Models
Issue: How are responsibilities distributed between system components and how
are these components placed?
 Client-server model
 Peer-to-peer
Variations of the above two:
 Proxy server
 Mobile code
 Mobile agents
 Network computers
 Thin clients
 Mobile devices

Client -Server Model


The system is structured as a set of processes, called servers that offer services to
the users, called clients.

The client-server model is usually based on a simple request/reply protocol,


implemented with send/receive primitives or using remote procedure calls (RPC)
or remote method invocation (RMI):
 The client sends a request
(invocation) message to the server
asking for some service;
 The server does the work and returns
a result (e.g. the data requested) or
an error code if the work could not be
performed.
 A server can itself request services
from other servers; thus, in this new
relation, the server itself acts like a
client.

Some problems with client-server:
 Centralization of service ⇒ poor scaling;
 Limitations: capacity of the server, bandwidth of the network connecting
the server.
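The request/reply interaction described above can be sketched with plain send/receive primitives. The following is a minimal illustration in Python, not part of any real middleware: the echo-style service, the message format, and the `request` helper are all invented for this sketch.

```python
import socket
import threading

def run_server(sock):
    """Serve one request: receive a message, do the work, return a reply."""
    conn, _ = sock.accept()
    with conn:
        request_msg = conn.recv(1024).decode()   # receive the request message
        reply = f"echo:{request_msg}"            # the "service" (here: echo)
        conn.sendall(reply.encode())             # return the result

def request(address, message):
    """Client side: send a request and block until the reply arrives."""
    with socket.create_connection(address) as conn:
        conn.sendall(message.encode())
        return conn.recv(1024).decode()

# Set up a listening server on an ephemeral localhost port.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen()
threading.Thread(target=run_server, args=(server,), daemon=True).start()

reply = request(server.getsockname(), "balance?")
print(reply)  # echo:balance?
```

Note how the "server" could itself open a connection to another server inside `run_server`, acting as a client in that second relation.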

Peer-to-Peer Model

 All processes (objects) play similar role.


 Processes (objects) interact without particular distinction between clients and
servers.
 The pattern of communication depends on the particular application.
 A large number of data objects are shared; any
individual computer holds only a small part of
the application database.
 Processing and communication loads for access
to objects are distributed across many
computers and access links.
 This is the most general and flexible model.

Peer-to-peer tries to solve the problems of the centralized system:
 It distributes shared resources widely, sharing computing and communication
loads.
Problems with peer-to-peer: high complexity, due to the need to
 cleverly place individual objects;
 retrieve the objects;
 maintain a potentially large number of replicas.

2. Interaction Models

Issue: How do we handle time? Are there time limits on process execution,
message delivery, and clock drifts?
 Synchronous distributed systems
 Asynchronous distributed systems

Synchronous Distributed Systems

Main features:
 Lower and upper bounds on execution time of processes can be set.
 Transmitted messages are received within a known bounded time.
 Drift rates between local clocks have a known bound.

Important consequences:
 In a synchronous distributed system there is a notion of global physical
time (with a known relative precision depending on the drift rate).
 Only synchronous distributed systems have a predictable behavior in terms
of timing. Only such systems can be used for hard real-time applications.
 In a synchronous distributed system it is possible and safe to use timeouts
in order to detect failures of a process or communication link.
It is difficult and costly to implement synchronous distributed systems.

Asynchronous Distributed Systems


Many distributed systems (including those on the Internet) are asynchronous.
 No bound on process execution time (nothing can be assumed about
speed, load, and reliability of computers).
 No bound on message transmission delays (nothing can be assumed about
speed, load, reliability of interconnections)
 No bounds on drift rates between local clocks.

Important consequences:

1. In an asynchronous distributed system there is no global physical time.


Reasoning can be only in terms of logical time (see lecture on time and
state).
2. Asynchronous distributed systems are unpredictable in terms of timing.
3. No timeouts can be used.
Asynchronous systems are widely and successfully used in practice.
In practice timeouts are used with asynchronous systems for failure detection.
However, additional measures have to be applied in order to avoid duplicated
messages, duplicated execution of operations, etc.
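One of the "additional measures" mentioned above is suppressing duplicated execution when a client retries after a timeout. A common sketch is to tag each request with a unique id and have the server cache replies; the `DedupServer` class and the bank-balance example below are hypothetical, chosen only to illustrate the idea.

```python
import uuid

class DedupServer:
    """Executes each request at most once, keyed by a client-chosen request id."""
    def __init__(self):
        self.seen = {}          # request id -> cached reply
        self.balance = 100

    def handle(self, request_id, amount):
        if request_id in self.seen:          # duplicate: client retried after a timeout
            return self.seen[request_id]     # replay the cached reply, do not re-execute
        self.balance -= amount               # execute the operation exactly once
        reply = self.balance
        self.seen[request_id] = reply
        return reply

server = DedupServer()
rid = uuid.uuid4().hex
first = server.handle(rid, 30)    # original request
second = server.handle(rid, 30)   # duplicate caused by a timeout-triggered retry
print(first, second)              # 70 70 -- the withdrawal happened only once
```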

3. Fault Models
Issue: What kind of faults can occur and what are their effects?
 Omission faults
 Arbitrary faults
 Timing faults

1. Faults can occur both in processes and communication channels. The


reason can be both software and hardware faults.
2. Fault models are needed in order to build systems with predictable
behavior in case of faults (systems which are fault tolerant).
3. Of course, such a system will function according to the predictions, only as
long as the real faults behave as defined by the “fault model”. If not .......
4. These issues will be discussed in some of the following chapters and in
particular in the chapter on “Recovery and Fault Tolerance”.

Omission Faults
A processor or communication channel fails to perform actions it is supposed to
do.
This means that the particular action is not performed!
 We do not have an omission fault if:
o An action is delayed (regardless how long) but finally executed.
o An action is executed with an erroneous result.
With synchronous systems, omission faults can be detected by timeouts.
 If we are sure that messages arrive, a timeout will indicate that the sending
process has crashed. Such a system has ‘fail-stop’ behavior.

Arbitrary (Byzantine) Faults


 This is the most general and worst possible fault semantics.
 Intended processing steps or communications are omitted or/and
unintended ones are executed.
 Results may not come at all or may come but carry wrong values.

Timing Faults
Timing faults can occur in synchronous distributed systems, where time limits are
set to process execution, communications, and clock drifts.
A timing fault occurs if any of these time limits is exceeded.

COMMUNICATION IN DISTRIBUTED SYSTEM

Communication Models and their Layered Implementation

Communication between distributed objects is performed by means of two
models:
1. Remote Method Invocation (RMI)
2. Remote Procedure Call (RPC)

RMI, as well as RPC, is based on request and reply primitives. Request and
reply are implemented on top of the network protocol (e.g. TCP or UDP in the
case of the Internet).

Network Protocol
Middleware and distributed applications have to be implemented on top of a
network protocol. Such a protocol is implemented as several layers.
In case of the Internet:

TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are both
transport protocols implemented on top of the Internet Protocol (IP).

TCP (Transmission Control Protocol)


TCP is a reliable protocol.
TCP guarantees the delivery to the receiving
process of all data delivered by the sending process, in the same order.

 TCP implements additional mechanisms on top of IP to meet reliability
guarantees:
o Sequencing: A sequence number is attached to each transmitted
segment (packet). At the receiver side, no segment is delivered until
all lower numbered segments have been delivered.
o Flow control: The sender takes care not to overwhelm the receiver
(or intermediate nodes). This is based on periodic acknowledgements
received by the sender from the receiver.
o Retransmission and duplicate handling: If a segment is not
acknowledged within a specified timeout, the sender retransmits it.
Based on the sequence number, the receiver is able to detect and
reject duplicates.
o Buffering: Buffering is used to balance the flow between sender and
receiver. If the receiving buffer is full, incoming segments are
dropped. They will not be acknowledged and the sender will
retransmit them.
o Checksum: Each segment carries a checksum. If a received segment
fails its checksum, it is dropped (and will be retransmitted).
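The sequencing and duplicate-handling mechanisms above can be sketched in a few lines: the receiver delivers segments strictly in sequence order, buffers segments that arrive early, and rejects retransmitted duplicates by their sequence number. The `InOrderReceiver` class is a simplified illustration, not TCP's actual implementation.

```python
class InOrderReceiver:
    """Delivers segments in sequence order; buffers gaps, drops duplicates."""
    def __init__(self):
        self.next_seq = 0      # lowest sequence number not yet delivered
        self.buffer = {}       # out-of-order segments waiting for the gap to fill
        self.delivered = []

    def receive(self, seq, data):
        if seq < self.next_seq or seq in self.buffer:
            return                                # duplicate: already delivered/buffered
        self.buffer[seq] = data
        while self.next_seq in self.buffer:       # deliver every contiguous segment
            self.delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1

r = InOrderReceiver()
r.receive(1, "b")       # arrives early: buffered, not delivered yet
r.receive(0, "a")       # fills the gap: "a" then "b" are delivered
r.receive(0, "a")       # retransmitted duplicate: rejected
r.receive(2, "c")
print(r.delivered)      # ['a', 'b', 'c']
```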

UDP (User Datagram Protocol)


UDP is a protocol that does not guarantee reliable transmission.
 UDP offers no guarantee of delivery. According to the IP, packets may be
dropped because of congestion or network error. UDP adds no additional
reliability mechanism to this.
 UDP provides a means of transmitting messages with minimal additional
costs or transmission delays above those due to IP transmission. Its use is
restricted to applications and services that do not require reliable delivery
of messages.
 If reliable delivery is requested with UDP, reliability mechanisms have to be
implemented at the application level.
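A minimal example of such application-level reliability over UDP is a timeout-and-retransmit loop on the client side. The `reliable_udp_request` helper and the localhost echo server below are invented for this sketch; only the standard `socket` API is real.

```python
import socket
import threading

def reliable_udp_request(sock, server_addr, payload, retries=3, timeout=0.5):
    """Resend the datagram until a reply arrives or retries run out
    (reliability added at the application level, since UDP adds none)."""
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(payload, server_addr)
        try:
            reply, _ = sock.recvfrom(1024)
            return reply
        except socket.timeout:
            continue                      # request or reply was lost: retry
    raise TimeoutError("no reply after retries")

# A trivial UDP echo server on localhost, used only for this demonstration.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))

def serve_once():
    data, addr = server.recvfrom(1024)
    server.sendto(b"ok:" + data, addr)

threading.Thread(target=serve_once, daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
reply = reliable_udp_request(client, server.getsockname(), b"ping")
print(reply)
```

Note that retransmission alone can cause duplicated execution on the server, which is why duplicate handling (as discussed for asynchronous systems) is usually needed as well.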

Request and Reply Primitives


Communication between processes and objects in a distributed system is
performed by message passing.
In a typical scenario (e.g. client-server model) such a communication is through
request and reply messages

Request-Reply Communication in a Client-Server Model


The system is structured as a group of processes
(objects), called servers that deliver services to
clients.

Remote Method Invocation (RMI) and Remote Procedure Call (RPC)

The goal: make, for the programmer, distributed


computing look like centralized computing.
The solution:
 Asking for a service is solved by the client
issuing a simple method invocation or
procedure call; because the server can be
on a remote machine this is a remote
invocation (call).
 RMI (RPC) is transparent: the calling object
(procedure) is not aware that the called one
is executing on a different machine, and vice versa.

Implementation of RMI

Who are the players?


 Object A asks for a service
 Object B delivers the service

1. The proxy for object B

 If an object A holds a remote reference to a (remote) object B, there exists
a proxy object for B on the machine which hosts A. The proxy is created
when the remote object reference is used for the first time. For each
method in B there exists a corresponding method in the proxy.
 The proxy is the local representative of the remote object ⇒ the remote
invocation from A to B is initially handled like a local one, from A to the
proxy for B.
 At invocation, the corresponding proxy method marshals the arguments
and builds the message to be sent, as a request, to the server.
 After reception of the reply, the proxy unmarshals the received message
and passes the results, in an answer, to the invoker.

2. The skeleton for object B


 On the server side, there exists a skeleton object corresponding to a class, if
an object of that class can be accessed by RMI. For each method in B there
exists a corresponding method in the skeleton.
 The skeleton receives the request message, unmarshals it and invokes the
corresponding method in the remote object; it waits for the result and
marshals it into the message to be sent with the reply.
 A part of the skeleton is also called dispatcher. The dispatcher receives a
request from the communication module, identifies the invoked method
and directs the request to the corresponding method of the skeleton.
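The proxy/skeleton pair can be sketched in-process. In this hypothetical example `pickle` stands in for real marshalling, a direct function call stands in for the communication module, and the `Account` class and its `deposit` method are invented for illustration.

```python
import pickle

class Account:                      # the remote object B
    def deposit(self, amount):
        return 100 + amount         # toy behavior: new balance

class Skeleton:
    """Server side: unmarshals the request, dispatches to the real object,
    and marshals the result into the reply."""
    def __init__(self, target):
        self.target = target
    def handle(self, request_bytes):
        method, args = pickle.loads(request_bytes)        # unmarshal request
        result = getattr(self.target, method)(*args)      # dispatch + invoke
        return pickle.dumps(result)                       # marshal reply

class Proxy:
    """Client side: marshals the invocation into a request message and
    unmarshals the reply. `transport` stands in for the communication module."""
    def __init__(self, transport):
        self.transport = transport
    def deposit(self, amount):                            # mirrors Account.deposit
        request = pickle.dumps(("deposit", (amount,)))    # marshal arguments
        return pickle.loads(self.transport(request))      # unmarshal result

skeleton = Skeleton(Account())
proxy = Proxy(transport=skeleton.handle)   # in-process "network" for the sketch
result = proxy.deposit(25)
print(result)                              # 125 -- looks exactly like a local call
```

From the caller's point of view `proxy.deposit(25)` is an ordinary method call, which is precisely the transparency RMI aims for.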

Implementation is as follows

 Object A and Object B belong to the application.


 Remote reference module and communication module belong to the
middleware.
 The proxy for B and the skeleton for B represent the so called RMI
software. They are situated at the border between middleware and
application and usually can be generated automatically with help of
available tools that are delivered together with the middleware software.

Communication module
 The communication modules on the client and the server are responsible for
carrying out the exchange of messages which implement the
request/reply protocol needed to execute the remote invocation.
 The particular messages exchanged, and the way errors are handled,
depend on the RMI semantics which is implemented.

Remote reference module


 The remote reference module translates between local and remote object
references. The correspondence between them is recorded in a remote
object table.
 Remote object references are initially obtained by a client from a so-called
binder that is part of the global name service (it is not part of the remote
reference module). Here servers register their remote objects and clients
look up services.

Remote Procedure Call (RPC)



TIME AND STATE IN DISTRIBUTED SYSTEMS


Because each machine in a distributed system has its own clock there is no notion
of global physical time.
The n crystals on the n computers will run at slightly different rates, causing the
clocks gradually to get out of synchronization and give different values.

Problems:
 Time triggered systems: these are systems in which certain activities are
scheduled to occur at predefined moments in time. If such activities are to
be coordinated over a distributed system we need a coherent notion of
time.
Example: time-triggered real-time systems
 Maintaining the consistency of distributed data is often based on the time
when a certain modification has been performed.
Example: a make program.

The make-program example

When the programmer has finished changing some source files he starts make;
make examines the times at which all object and source files were last modified
and decides which source files have to be recompiled.

In the given example, although P.c is modified after P.o has been generated,
because of the clock drift the time assigned to P.c is smaller,
i.e. P.c will not be recompiled for the new version!

Solutions:
1. Synchronization of physical clocks
 Computer clocks are synchronized with one another to an achievable,
known degree of accuracy ⇒ within the bounds of this accuracy we can
coordinate activities on different computers using each computer’s local
clock.
 Physical clock synchronization is needed for distributed real-time systems.

2. Logical clocks

 In many applications we are not interested in the physical time at which


events occur; what is important is the relative order of events! The make-
program is such an example.
 In such situations we don’t need synchronized physical clocks. Relative
ordering is based on a virtual notion of time - logical time.
 Logical time is implemented using logical clocks.

Lamport’s Logical Clocks


1. The order of events occurring at different processes is critical for many
distributed applications. Example: P.o_created and P.c_created.

2. Ordering can be based on two simple situations:


I. If two events occurred in the same process then they occurred in the
order observed following the respective process;
II. Whenever a message is sent between processes, the event of
sending the message occurred before the event of receiving it.

3. Ordering by Lamport is based on the happened-before relation (denoted by →):
I. a → b, if a and b are events in the same process and a occurred
before b;
II. a → b, if a is the event of sending a message m in a process, and b is
the event of the same message m being received by another process;
III. if a → b and b → c, then a → c (the relation is transitive).
4. If a → b, we say that event a causally affects event b. The two events are
causally related.
5. There are events which are not related by the happened-before relation. If
both a → e and e → a are false, then a and e are concurrent events; we can
write a || e.

P1, P2, P3: processes;


a, b, c, d, e, f: events;
a → b, c → d, e → f, b → c, d → f
a → c, a → d, a → f, b → d, b → f, ...
a || e, c || e, ...

6. Using physical clocks, the happened-before relation cannot be captured. It is
possible that b → c and at the same time Tb > Tc (Tb is the physical time of b).
7. Logical clocks can be used in order to capture the happened-before relation.
 A logical clock is a monotonically increasing software counter.
 There is a logical clock CPi at each process Pi in the system.
 The value of the logical clock is used to assign timestamps to events. CPi(a)
is the timestamp of event a in process Pi.
 There is no relationship between a logical clock and any physical clock.

To capture the happened-before relation, logical clocks have to be implemented


so that if a → b, then C(a) < C(b)

Implementation Rules of Lamport’s Logical Clocks


Implementation of logical clocks is performed using the following rules for
updating the clocks and transmitting their values in messages:

[R1]:
 CPi is incremented before each event is issued at process Pi: CPi := CPi + 1.
[R2]:
a) When ‘a’ is the event of sending a message ‘m’ from process Pi, then the
timestamp tm = CPi(a) is included in ‘m’ (CPi(a) is the logical clock value
obtained after applying rule R1).
b) On receiving message ‘m’ by process Pj, its logical clock CPj is updated as
follows: CPj := max(CPj, tm).
c) The new value of CPj is used to timestamp the event of receiving message
‘m’ by Pj (applying rule R1).

 If ‘a’ and ‘b’ are events in the same process and ‘a’
occurred before ‘b’, then a→b, and (by R1) C(a) <
C(b).
 If ‘a’ is the event of sending a message ‘m’ in a
process, and ‘b’ is the event of the same message
‘m’ being received by another process, then a→b,
and (by R2) C(a) < C(b).
 If a → b and b → c, then a → c, and (by induction)
C(a) < C(c).
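Rules R1 and R2 above translate almost directly into code. The following is a minimal sketch (the `LamportClock` class and the three-event scenario are invented for illustration):

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def event(self):                 # R1: increment before each local event
        self.time += 1
        return self.time             # timestamp assigned to the event

    def send(self):                  # R2a: piggyback the timestamp on the message
        return self.event()

    def receive(self, tm):           # R2b + R2c: take the max, then count
        self.time = max(self.time, tm)   # the receive event itself (R1)
        return self.event()

p1, p2 = LamportClock(), LamportClock()
a = p1.event()        # a: local event at P1
b = p1.send()         # b: P1 sends m carrying tm = C(b)
c = p2.receive(b)     # c: P2 receives m
print(a, b, c)        # 1 2 3, so a -> b -> c gives C(a) < C(b) < C(c)
```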

Problems with Lamport’s Logical Clocks


Lamport’s logical clocks impose only a partial order on the set of events; pairs of
distinct events generated by different processes can have identical timestamp.
 For certain applications a total ordering is needed; they consider that no
two events can occur at the same time.
 In order to enforce total ordering a global logical timestamp is introduced:
o the global logical timestamp of an event a occurring at process Pi,
with logical timestamp CPi(a), is a pair (CPi(a), i), where i is an
identifier of process Pi;
o we define
(CPi(a), i) < (CPj(b), j) if and only if
CPi(a) < CPj(b),
or
CPi(a) = CPj(b) and i < j.
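The lexicographic rule above maps directly onto tuple comparison; the event values below are hypothetical, chosen only to exercise both branches of the definition.

```python
def global_timestamp(c, i):
    """(logical time C, process id i): ties on C are broken by the process id."""
    return (c, i)

# Two events with the same Lamport timestamp at P1 and P2, one earlier event at P2:
e1 = global_timestamp(5, 1)
e2 = global_timestamp(5, 2)
e3 = global_timestamp(4, 2)

# Python's tuple comparison is exactly the lexicographic rule in the text.
print(e3 < e1)   # True: 4 < 5
print(e1 < e2)   # True: equal C, tie broken by process id 1 < 2
```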

Disadvantages of Lamport’s Clock


Lamport’s logical clocks are not powerful enough to perform a causal ordering of
events.
 If a → b, then C(a) < C(b).
However, the reverse is not always true (if the events occurred in different
processes):
 If C(a) < C(b), then a → b is not necessarily true. ( It is only guaranteed that
b → a is not true).

In the figure C(e) < C(b), however there is no causal


relation from event e to event b.
By just looking at the timestamps of the events, we
cannot say whether two events are causally related
or not.

 We would like messages to be


processed according to their causal
order. We would like to use the
associated timestamp for this purpose.
 Process P3 receives messages M1, M2,
and M3. send(M1) → send(M2),
send(M1) → send(M3), send(M3) ||
send(M2)

 M1 has to be processed before M2 and M3. However, P3 does not have to wait
for M3 in order to process it before M2 (although M3’s logical clock timestamp
is smaller than M2’s).

Vector Clocks
Vector clocks give the ability to decide whether two events are causally related or
not by simply looking at their timestamp.
 Each process Pi has a clock CPi, which is an integer vector of length n (n is
the number of processes).
 The value of CPi is used to assign timestamps to events in process Pi.
CvPi(a) is the timestamp of event a in process Pi.
 CPi[i], the ith entry of CPi, corresponds to Pi’s own logical time.
 CPi[j], j ≠ i, is Pi’s "best guess" of the logical time at Pj.
 CPi[j] indicates the (logical) time of occurrence of the last event at Pj which
is in a happened before relation to the current event at Pi.

Implementation of Vector Clock
Implementation of vector clocks is performed using the following rules for
updating the clocks and transmitting their values in messages:
[R1]:
 CPi is incremented before each event is issued at process
Pi: CPi[i] := CPi[i] + 1.
[R2]:
a) When ‘a’ is the event of sending a message ‘m’ from process Pi, then the
timestamp tm = CPi(a) is included in ‘m’ (CPi(a) is the vector clock value
obtained after applying rule R1).
b) On receiving message ‘m’ by process Pj, its vector clock CPj is updated as
follows: ∀k ∈ {1, 2, ..., n}, CPj[k] := max(CPj[k], tm[k]).

c) The new value of CPj is used to timestamp the event of receiving message
‘m’ by Pj (applying rule R1).
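Rules R1 and R2 can be sketched as a small class (a minimal illustration; the class and method names are not from the notes):

```python
# Sketch of the vector-clock update rules R1 and R2.
class VectorClock:
    def __init__(self, process_id, n):
        self.i = process_id            # index of this process Pi
        self.clock = [0] * n           # CPi, one entry per process

    def tick(self):                    # [R1]: increment own entry
        self.clock[self.i] += 1

    def send(self):                    # [R2a]: apply R1, attach timestamp tm
        self.tick()
        return list(self.clock)        # copy carried inside the message

    def receive(self, tm):             # [R2b]: component-wise maximum,
        self.clock = [max(c, t) for c, t in zip(self.clock, tm)]
        self.tick()                    # [R2c]: timestamp the receive event

# P0 sends a message to P1:
p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
tm = p0.send()                         # p0.clock becomes [1, 0]
p1.receive(tm)                         # p1.clock becomes [1, 1]
```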

For any two vector timestamps u and v, we have:
 u = v if and only if ∀i, u[i] = v[i]
 u ≤ v if and only if ∀i, u[i] ≤ v[i]
 u < v if and only if (u ≤ v ∧ u ≠ v)
 u || v if and only if ¬(u < v) ∧ ¬(v < u)
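The four relations translate line by line into predicates (a sketch; the helper names are illustrative):

```python
# The four relations on vector timestamps, written from the definitions above.
def vt_eq(u, v):       return all(ui == vi for ui, vi in zip(u, v))
def vt_le(u, v):       return all(ui <= vi for ui, vi in zip(u, v))
def vt_lt(u, v):       return vt_le(u, v) and not vt_eq(u, v)
def concurrent(u, v):  return not vt_lt(u, v) and not vt_lt(v, u)

assert vt_lt([1, 0], [1, 1])          # strictly dominated in one entry
assert concurrent([1, 0], [0, 1])     # neither dominates: concurrent events
```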

Causal Order
 Two events a and b are causally related if and only if
o C(a) < C(b) or C(b) < C(a). Otherwise the events are concurrent.
 With vector clocks we get the property which we missed for Lamport’s
logical clocks:
o a → b if and only if Cv(a) < Cv(b). Thus, by just looking at the
timestamps of the events, we can say whether two events are
causally related or not.

Causal Ordering of Messages Using Vector Clocks
We would like messages to be processed according to their causal order: if
send(M1) → send(M2), then every recipient of both messages M1 and M2 must
receive M1 before M2.

Basic Idea:
A message is delivered to a process only if the message immediately preceding it
(considering the causal ordering) has been already delivered to the process.
Otherwise, the message is buffered.
We assume that processes communicate using broadcast messages. (There exist
similar protocols for non-broadcast communication too.)
The events which are of interest here are the sending of messages ⇒ vector
clocks will be incremented only for message sending.

Implementation of Causal Order

BIRMAN-SCHIPER-STEPHENSON PROTOCOL (BSS)

Implementation of the protocol is based on the following rules:
[R1]:
a) Before broadcasting a message m, a process Pi increments the vector clock:
CPi[i] := CPi[i] + 1.
b) The timestamp tm = CPi is included in m.
[R2]:
The receiving side, at process Pj, delays the delivery of message ‘m’ coming from
Pi until both the following conditions are satisfied:
1. CPj[i] = tm[i] - 1
2. ∀k ∈ {1, 2, .., n} - {i}, CPj[k] ≥ tm[k]

Delayed messages are queued at each process in a queue that is sorted by their
vector timestamp; Concurrent messages are ordered by the time of their arrival.

[R3]:
When a message is delivered at process Pj, its vector clock CPj is updated
according to rule [R1:b] for vector clock implementation.
 tm[i] - 1 indicates how many messages originating from Pi precede m.
 Step [R2.1] ensures that process Pj has received all the messages
originating from Pi that precede m.
 Step [R2.2] ensures that Pj has received all those messages received by Pi
before sending m.
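The delivery test [R2.1]–[R2.2] can be sketched as a single predicate evaluated at Pj for a message from Pi (an illustration; the function name is not from the notes):

```python
# Sketch of the BSS delivery test at process Pj for a message from Pi.
# cpj is Pj's vector clock; tm is the timestamp carried by the message.
def can_deliver(cpj, tm, i):
    # [R2.1]: m is the very next broadcast Pj expects from Pi
    if cpj[i] != tm[i] - 1:
        return False
    # [R2.2]: Pj has already seen everything Pi had seen before sending m
    return all(cpj[k] >= tm[k] for k in range(len(cpj)) if k != i)

# Pj with clock [0, 0, 0] receives broadcasts from P0:
assert can_deliver([0, 0, 0], [1, 0, 0], 0)       # first message: deliverable
assert not can_deliver([0, 0, 0], [2, 0, 0], 0)   # message 1 still missing: buffer
```

A message failing the test stays in the sorted queue until later deliveries advance CPj enough for both conditions to hold.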

SCHIPER-EGGLI-SANDOZ PROTOCOL (SES)

The basic idea is the same as in BSS, but there is no need for broadcast
messages.
Each process maintains a vector V_P of size N - 1, where N is the number of
processes in the system.
V_P is a vector of tuples (P’, t): P’ is the destination process id and t is a
vector timestamp.
tm: logical time of sending message m; Tpi: present logical time at Pi.
Initially, V_P is empty.
The algorithm is as follows:

Sending a Message:
 Send message M, time stamped tm, along with V_P1 to P2.
 Insert (P2, tm) into V_P1. Overwrite the previous value of (P2,t), if any.
 (P2,tm) is not sent. Any future message carrying (P2,tm) in V_P1 cannot be
delivered to P2 until tm < tP2.

Delivering a message
 If V_M (in the message) does not contain any pair (P2, t), it can be
delivered.
 /* (P2, t) exists */ If t ≥ Tp2, buffer the message. (Don’t deliver).
 else (t < Tp2) deliver it

What does the condition t ≥ Tp2 imply?
 t is the message’s vector timestamp.
 t ≥ Tp2 → for all j, t[j] ≥ Tp2[j]
 This implies that some events occurred in other processes without P2’s
knowledge, so P2 decides to buffer the message.
 When t < Tp2, the message is delivered and Tp2 is updated with the help of
V_P2 (after the merge operation).
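The buffer-or-deliver decision can be sketched as a predicate (a sketch under assumptions: V_M is modelled as a dict from destination id to recorded timestamp, and "deliver" is read as the t < Tp2 case from the rule above; all names are illustrative):

```python
# Sketch of the SES delivery test at a destination process (id `dest`).
# v_m: the vector V_M carried in the message, as {process_id: timestamp}.
# t_dest: the destination's present logical (vector) time.
def can_deliver_ses(v_m, dest, t_dest):
    if dest not in v_m:                 # no pair (dest, t) in V_M: deliverable
        return True
    t = v_m[dest]
    # deliver only when t < t_dest (component-wise <= and not equal);
    # otherwise the message is buffered
    return all(a <= b for a, b in zip(t, t_dest)) and tuple(t) != tuple(t_dest)

# From the example below: M3 carries (P1, <0,1,0>) while Tp1 = <0,0,0>
assert not can_deliver_ses({1: (0, 1, 0)}, 1, (0, 0, 0))   # M3 is buffered
assert can_deliver_ses({1: (0, 1, 0)}, 1, (1, 1, 0))       # after M1: deliver
```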

Example:

 M1 from P2 to P1: M1 + Tm (= <0,1,0>) + empty V_P2
 M2 from P2 to P3: M2 + Tm (= <0,2,0>) + (P1, <0,1,0>)
 M3 from P3 to P1: M3 + <0,2,2> + (P1, <0,1,0>)
 M3 gets buffered because:
o Tp1 is <0,0,0>, t in (P1, t) is <0,1,0> & so Tp1 < t
 When M1 is received by P1:
o Tp1 becomes <1,1,0>, by rules 1 and 2 of vector clock.
 After updating Tp1, P1 checks buffered M3.
o Now, Tp1 > t [in (P1, <0,1,0>)].
o So M3 is delivered.
 On delivering the message:
o Merge V_M (in message) with V_P2 as follows.
o If (P,t) is not there in V_P2, merge.
o If (P,t) is present in V_P2, t is updated with max(t[i] in Vm, t[i] in
V_P2). {Component-wise maximum}.
o Message cannot be delivered until t in V_M is greater than t in V_P2
o Update site P2’s local, logical clock.
o Check buffered messages after local, logical clock update.
Global States
Problem: How to collect and record a consistent global state in a distributed
system.
Why a problem?
Because there is no global clock (no coherent notion of time) and no shared
memory!

Consider a bank system with two accounts A and B at two different sites; we
transfer $50 between A and B.

 In general, a global state consists of a set of local states and a set of states
of the communication channels.
 The state of the communication channel in a consistent global state should
be the sequence of messages sent along the channel before the sender’s
state was recorded, excluding the sequence of messages received along the
channel before the receiver’s state was recorded.
 It is difficult to record channel states to ensure the above rule ⇒
global states are very often recorded without using channel states.
This is the case in the definition below.

Formal Definition
 LSi is the local state of process Pi. Besides other information, the local state
also includes a record of all messages sent and received by the process.
 We consider the global state GS of a system, as the collection of the local
states of its processes: GS = {LS1, LS2, ..., LSn}.
 A certain global state can be consistent or not!
 send(Mij) denotes the event of sending message Mij from Pi to Pj;
rec(Mij) denotes the event of receiving message Mij by Pj.
 send(Mij) ∈ LSi if and only if the sending event occurred before the
local state was recorded;
 rec(Mij) ∈ LSj if and only if the receiving event occurred before the
local state was recorded.
 transit(LSi, LSj) = {Mij | send(Mij) ∈ LSi ∧ rec(Mij) ∉ LSj}
 inconsistent(LSi, LSj) = {Mij | send(Mij) ∉ LSi ∧ rec(Mij) ∈ LSj}

Consistent Global State

A global state GS = {LS1, LS2, ..., LSn} is consistent if and only if:
∀i, ∀j: 1 ≤ i, j ≤ n :: inconsistent(LSi, LSj) = ∅

 In a consistent global state, for every received message a corresponding
send event is recorded in the global state.
 In an inconsistent global state, there is at least one message whose receive
event is recorded but its send event is not recorded.

Transitless Global State

A global state GS = {LS1, LS2, ..., LSn} is transitless if and only if:
∀i, ∀j: 1 ≤ i, j ≤ n :: transit(LSi, LSj) = ∅

 All messages recorded to be sent are also recorded to be received.

Strongly Consistent Global State
A global state is strongly consistent if it is consistent and transitless.
A strongly consistent state corresponds to a consistent state in which all messages
recorded as sent are also recorded as received.

The global state is seen as a collection of the local states, without explicitly
capturing the state of the channel.
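Since the definitions rely only on recorded send and receive events, they can be checked directly (a sketch; each local state is modelled as sets of message ids, and the function names are illustrative):

```python
# Checking consistency and transitlessness of a recorded global state.
def is_consistent(states):
    sent = set().union(*(s["sent"] for s in states))
    recd = set().union(*(s["received"] for s in states))
    return recd <= sent          # every recorded receive has a recorded send

def is_transitless(states):
    sent = set().union(*(s["sent"] for s in states))
    recd = set().union(*(s["received"] for s in states))
    return sent <= recd          # no message still in transit

ls1 = {"sent": {"m1"}, "received": set()}
ls2 = {"sent": set(), "received": {"m1"}}
assert is_consistent([ls1, ls2]) and is_transitless([ls1, ls2])  # strongly consistent
assert not is_consistent([{"sent": set(), "received": {"m2"}}])  # orphan receive
```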
Example
{LS11, LS22, LS32} is inconsistent;
{LS12, LS23, LS33} is consistent;
{LS11, LS21, LS31} is strongly consistent.

Global State Recording

Chandy-Lamport Algorithm
 The algorithm records a collection of local states which give a consistent
global state of the system. In addition, it records the state of the channels
which is consistent with the collected global state.
 Such a recorded "view" of the system is called a snapshot.
 We assume that processes are connected through unidirectional channels
and message delivery is FIFO.
 We assume that the graph of processes and channels is strongly connected
(there exists a path between any two processes).
 The algorithm is based on the use of a special message, snapshot token, in
order to control the state collection process.

How to collect a global state

 A process Pi records its local state LSi and later sends a message ‘m’ to Pj;
LSj at Pj has to be recorded before Pj has received m.
 The state SChij of the channel Chij consists of all messages that process Pi
sent before recording LSi and which have not been received by Pj when
recording LSj.
 A snapshot is started at the request of a particular process Pi, for example,
when it suspects a deadlock because of long delay in accessing a resource;
Pi then records its state LSi and, before sending any other message, it sends
a token to every Pj that Pi communicates with.
 When Pj receives a token from Pi, and this is the first time it received a
token, it must record its state before it receives the next message from Pi.
After recording its state Pj sends a token to every process it communicates
with, before sending them any other message.
Algorithm
Rule for sender Pi:
/* performed by the initiating process and by any other process at the reception
of the first token */
[SR1]:
Pi records its state.
[SR2]:
Pi sends a token on each of its outgoing channels.

Rule for receiver Pj:
/* executed whenever Pj receives a token from another process Pi on channel Chij
*/
[RR1]:
If Pj has not yet recorded its state
Then
Follow the "Rule for sender".
Record the state of the channel: SChij := ∅.
Else
Record the state of the channel: SChij := M,
Where M is the set of messages that Pj received from Pi after Pj recorded
its state and before Pj received the token on Chij.
End if.
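The sender and receiver rules can be sketched as a per-process handler (a simplified single-threaded model; class and attribute names are illustrative, and tokens to transmit are collected in an outbox rather than sent over a real network):

```python
# Sketch of the Chandy-Lamport rules at one process.
class Process:
    def __init__(self, pid, out_channels):
        self.pid, self.out = pid, out_channels
        self.recorded, self.state = False, None
        self.chan = {}                  # recorded channel states SCh[src]
        self.done = set()               # incoming channels whose token arrived
        self.outbox = []                # (dest, "TOKEN") pairs to transmit

    def start_snapshot(self, local_state):
        self.state, self.recorded = local_state, True          # [SR1]
        self.outbox += [(j, "TOKEN") for j in self.out]        # [SR2]

    def on_receive(self, src, msg, local_state):
        if msg == "TOKEN":
            if not self.recorded:       # [RR1] If-branch: first token seen
                self.start_snapshot(local_state)
            self.chan.setdefault(src, [])   # SCh_src: ∅, or M gathered below
            self.done.add(src)              # channel src is now closed
        elif self.recorded and src not in self.done:
            # message received after recording LSj, before src's token:
            # it belongs to the channel state SCh_src
            self.chan.setdefault(src, []).append(msg)

# P1 gets a token from P0, then an in-flight message and a token from P2:
p = Process(1, [0, 2])
p.on_receive(0, "TOKEN", "LS1")         # records state, emits tokens
p.on_receive(2, "m5", "LS1")            # goes into SCh from P2
p.on_receive(2, "TOKEN", "LS1")         # closes channel from P2
```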

Huang’s Termination Detection Algorithm
Model
 Processes can be active or idle
 Only active processes send messages
 An idle process can become active on receiving a computation message
 Active process can become idle at any time
 Termination: all processes are idle and no computation message are in
transit
 Can use global snapshot to detect termination also
 One controlling agent, has weight 1 initially
 All other processes are idle initially and have weight 0
 Computation starts when controlling agent sends a computation message
to a process
 An idle process becomes active on receiving a computation message
 B(DW) – computation message with weight DW. Can be sent only by the
controlling agent or an active process
 C(DW) – control message with weight DW, sent by active processes to the
controlling agent when they are about to become idle

Let the current weight at a process be W.
1. Send of B(DW):
• Find W1, W2 such that W1 > 0, W2 > 0, W1 + W2 = W
• Set W = W1 and send B(W2)
2. Receive of B(DW):
• W=W + DW;
• if idle, become active
3. Send of C(DW):
• send C(W) to controlling agent
• Become idle
4. Receive of C(DW):
• W = W + DW
• if W = 1, declare “termination”
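The four weight-throwing rules can be sketched as follows (an illustration with made-up class names; exact rational arithmetic via `fractions.Fraction` avoids floating-point loss when weights are split repeatedly):

```python
# Sketch of Huang's weight-throwing rules.
from fractions import Fraction

class Agent:
    def __init__(self, is_controller=False):
        self.w = Fraction(1) if is_controller else Fraction(0)
        self.active = is_controller

    def send_B(self):                  # rule 1: split weight, send B(W2)
        half = self.w / 2              # W1 = W2 = W/2 satisfies the rule
        self.w -= half
        return half                    # DW carried by the B message

    def recv_B(self, dw):              # rule 2: absorb weight, become active
        self.w += dw
        self.active = True

    def send_C(self):                  # rule 3: return all weight, go idle
        dw, self.w, self.active = self.w, Fraction(0), False
        return dw

    def recv_C(self, dw):              # rule 4 (at the controlling agent)
        self.w += dw
        return self.w == 1             # True ⇒ declare "termination"

ctrl, p = Agent(is_controller=True), Agent()
p.recv_B(ctrl.send_B())                # computation starts: p is active
terminated = ctrl.recv_C(p.send_C())   # p goes idle, weight returns: True
```

Termination is declared exactly when all distributed weight has returned, i.e. the controlling agent's weight is 1 again.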
