0% found this document useful (0 votes)
43 views65 pages

6cs5 Ds Unit 4 Unit 4

The document presents an overview of Distributed Shared Memory (DSM) systems, highlighting their role in distributed operating systems and the concept of shared memory in a distributed context. It discusses various memory consistency models, including strict, sequential, causal, pipelined random access memory, processor, weak, and release consistency models, outlining their characteristics and requirements. Additionally, it covers non-uniform memory access architectures and their implications for scalability and performance in distributed systems.

Uploaded by

Jitesh Kumawat
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
43 views65 pages

6cs5 Ds Unit 4 Unit 4

The document presents an overview of Distributed Shared Memory (DSM) systems, highlighting their role in distributed operating systems and the concept of shared memory in a distributed context. It discusses various memory consistency models, including strict, sequential, causal, pipelined random access memory, processor, weak, and release consistency models, outlining their characteristics and requirements. Additionally, it covers non-uniform memory access architectures and their implications for scalability and performance in distributed systems.

Uploaded by

Jitesh Kumawat
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 65

lOMoARcPSD|55959726

6CS5 DS Unit-4 - unit 4 ppt

Distributed systems (Rajasthan Technical University)

Scan to open on Studocu

Studocu is not sponsored or endorsed by any college or university


Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES, JAIPUR Distributed System CS/ VISem

ARYA GROUP OF COLLEGES


Arya College of Engineering & Research Centre, Jaipur

7/12/2021
Department of
Computer Science & Engineering

Subject Name with code: Distributed System (DS) (6CS5-11)


UNIT-IV-Distributed Shared Memory (DSM)

Presented by: Ms. Rashi Jain


Assistant Professor, CSE 1

Ms.Rashi jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS/VI Sem

UNIT-IV

7/12/2021
Distributed Shared
Memory

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Subject Detailed syllabus

7/12/2021
3

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Outline

7/12/2021
• Introduction of DSM
• Non-Uniform Memory Access Architectures
• Memory Consistency Models
• Multiprocessor Cache Systems
• Implementation of DSM Systems

• Models of Distributed Computation:


-Preliminaries
- Causality
- Distributed Snapshots
- Modelling a Distributed Computation
- Failures in Distributed System
- Distributed Mutual Exclusion
- Election
4
- Distributed Deadlock handling
- Distributed Termination Detection
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Distributed shared memory system

7/12/2021
Distributed shared memory(DSM) system is a resource management component
of distributed operating system that implements shared memory model in distributed
system which have no physically shared memory. The shared memory model provides a virtual
address space which is shared by all nodes in a distributed system.

Distributed shared memory (DSM) is a form of memory architecture where physically separated
memories can be addressed as one logically shared address space.

Note: Here, the term "shared" does not mean that there is a single centralized memory, but that the
address space is "shared" (same physical address on two processors refers to the same location in
memory). 5

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

Distributed System CS VI Sem


ARYA GROUP OF COLLEGES JAIPUR

DSM cont...
o A distributed-memory system, often called a multicomputer, consists of multiple
independent processing nodes with local memory modules which is connected by a

7/12/2021
general interconnection network.

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Non-uniform memory access architectures

7/12/2021
NUMA is a type of parallel processing architecture.

There are two types of parallel processing architectures – Shared Memory


Architecture and Distributed Memory Architecture. Shared Memory
Architectures are of two types – Uniform Memory Access (UMA) and Non
Uniform Memory Access (NUMA).

In Non Uniform Memory Access (NUMA) as shown in Figure 2, each processor


has its own local memory. A processor can also have a built-in memory
controller as present in Intel’s Quick Path Interconnect (QPI) NUMA
Architecture.
Unlike Distributed Memory Architecture, the memory of other processor is 7
accessible but the latency to access them
Ms.Rashi is not same. The memory which is
Jain
local to other processor is called as remote memory or foreign memory.
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Numa cont...

7/12/2021
A processor usually uses its local memory
to store the data required for its processing.
Accessing a local memory has least latency.
It can also utilize the remote memory.
Scalability is not an issue even if the count
of processors grow up in this architecture.

In Cache Coherent NUMA (ccNUMA), a


processor can be directly linked via
HyperTransport bus to other.

NUMA provides better scalability than


UMA when number of processors is very 8
large. Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Memory consistency models

7/12/2021
Consistency requirement vary from application to application.

A consistency model basically refers to the degree of consistency that has


to be maintained for the shared memory data.

Defined as a set of rules that application must obey if they want the DSM
system to provide the degree of consistency guaranteed by the consistency
model.
If a system support the stronger consistency model then the weaker
consistency model is automatically supported but the converse is not true
9

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Types of Memory consistency models

7/12/2021
Strict Consistency model
Sequential Consistency model
Causal consistency model
Pipelined Random Access Memory consistency model(PRAM)
Processor Consistency model
Weak consistency model
Release consistency model

10

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Strict consistency model

7/12/2021
This is the strongest form of memory coherence having the most stringent
consistency requirement.

Value returned by a read operation on a memory address is always same as


the value written by the most recent write operation to that address.

All writes instantaneously become visible to all processes.

Implementation of the strict consistency model requires the existence of an


absolute global time.
11

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Strict consistency model cont...

7/12/2021
Absolute synchronization of clock of all the nodes of a distributed system is
not possible.

Implementation of strict consistency model for a DSM system is practically


impossible.

If the three operations read(r1), write(w1), read(r2) are performed on a


memory location in that order.

Only acceptable ordering for a strictly consistency memory is (r1, w1, r2)
12

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Sequential consistency model

7/12/2021
Proposed by Lamport [1979].

A shared memory system is said to support the sequential consistency model if all
processes see the same order.

Exact order of access operations are interleaved does not matter.

If the three operations read(r1), write(w1), read(r2) are performed on a memory


location in that order.

Any of the orderings (r1, w1, r2), (r1, r2, w1), (w1, r1, r2), (w1, r2, r1), (r2, r1, w1),
13
(r2, w1, r1) is acceptable provided all processes see the same ordering
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Sequential consistency model cont...

7/12/2021
The consistency requirement of the sequential consistency model is weaker
than that of the strict consistency model.

A sequentially consistency memory provide one-copy /single-copy semantics.

Sequentially consistency is acceptable by most applications.

14

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Casual consistency model cont...

7/12/2021
Proposed by Hutto and Ahamad (1990).

All processes see only those memory reference operations in the correct
order that are potentially causally related.

Memory reference operations not related may be seen by different


processes in different order.

Memory reference operation is said to be related to another memory


reference operation if one might have been influenced by the other.

15
Maintaining dependency graphs for memory access operations.
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Pipelined random access memory consistency


model (PRAM)

7/12/2021
Proposed by Lipton and Sandberg (1988).

Provides a weaker consistency semantics than the consistency model


described so far.

Ensures that all write operations performed by a single process are seen by
all other processes in the order in which they were performed.

All write operations performed by a single process are in a pipeline.

Write operations performed by different processes can be seen by different


16
processes in different order.
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Pipelined random access memory consistency


model (PRAM) cont...

7/12/2021
If w11 and w12 are two write operations performed by a process P1 in that
order, and w21 and w22 are two write operations performed by a process
P2 in that order.

A process P3 may see them in the order [(w11,w12), (w21,w22)] and


another process P4 may see them in the order [(w21,w22), (w11,w12)].

Simple and easy to implement and also has good performance.

PRAM consistency all processes do not agree on the same order of memory
reference operations 17

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Processor consistency model


Proposed by Goodman [1989].

7/12/2021
Very similar to PRAM model with additional restriction of memory coherence.

Memory coherence means that for any memory location all processes agree on the
same order of all write operations performed on the same memory location (no
matter by which process they are performed) are seen by all processes in the
same order.

If w12 and w22 are write operations for writing the same memory location x, all
processes must see them in the same order- w12 before w22 or w22 before w12.

Processes P3 and P4 must see in the same order, which may be either [(w11,w12),
18
(w21,w22)] or [(w21,w22), (w11,w12)]
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

weak consistency model


Proposed by Dubois [1988].

7/12/2021
Common characteristics to many application:

1. It is not necessary to show the change in memory done by every write


operation to other processes eg. when a process executes in a critical
section.
2. Isolated accesses to shared variable are rare.

Better performance can be achieved if consistency is enforced on a group of


memory reference operations rather than on individual memory reference
19
operations
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

weak consistency model cont...


DSM system that support the weak consistency model uses a special

7/12/2021
variable called a synchronization variable.

Requirements:
1. All accesses to synchronization variables must obey sequential
consistency semantics.
2. All previous write operations must be completed everywhere before an
access to a synchronization variable is allowed.
3. All previous accesses to synchronization variables must be completed
before access to a non synchronization variable is allowed.

20

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Release consistency model


Enhancement of weak consistency model.

7/12/2021
Use of two synchronization variables.

- Acquire (used to tell the system it is entering CR).


- Release (used to tell the system it has just exited CR).

Acquire results in propagating changes made by other nodes to process ‘s


node.

Release results in propagating changes made by the process to other nodes


21

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Release consistency model cont...


Barrier defines the end of a phase of execution of a group of concurrently
executing processes.

7/12/2021
Barrier can be implemented by using a centralized barrier server .

Requirements:
1 All accesses to acquire and release synchronization variable obey processor
consistency semantics.
2 All previous acquires perform by a process must be completed successfully before
the process is allowed to perform a data access operation on the memory.
3 All previous data access operations performed by a process must be completed
successfully before a release access done by the process is allowed.

A variation of release consistency is lazy release consistency proposed by Keleher


22

[1992] Ms.Rashi Jain


Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Multiprocessor cache systems


In a shared memory multiprocessor system with a
separate cache memory for each processor, it is possible to have many

7/12/2021
copies of shared data: one copy in the main memory and one in the
local cache of each processor that requested it.

23

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Multiprocessor cache systems cont...

7/12/2021
24

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Implementation of dsm systems


Distributed Shared Memory abbreviated as DSM is the implementation of shared memory
concept in distributed systems. The DSM system implements the shared memory models in
loosely coupled systems that are deprived of a local physical shared memory in the system.

In this type of system distributed shared memory provides a virtual memory space that is
accessible by all the system (also known as nodes) of the distributed hierarchy.

7/12/2021
Some common challenges that are to be kept in mind while the implementation of DSM −
Tracking of the memory address (location) of data stored remotely in shared memory.
To reduce the communication delays and high overhead associated with the references to remote
data.
Controlling the concurrent access of the data shared in DSM.

Based on these challenges there are algorithms designed to implement distributed shared memory.
There are four algorithms −
-Central Server Algorithm
-Migration Algorithm
-Read Replication Algorithm 25
-Full Replication Algorithm Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Central server algorithm


All shared data is maintained by the central
server. Other nodes of the distributed
system request for reading and writing data to
the server which serves the request and updates or
provides access to the data along

7/12/2021
with acknowledgment messages.

These acknowledgment messages are used to provide


the status of the data request is served by the server.
When the data is sent to the calling function, it
acknowledges a number that shows the access
sequence of the data to maintain concurrency. And
time-out is returned in case of failure.

For larger distributed systems, there can be more


than one server. In this case, the servers are located
using their address or using mapping functions. 26

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

migration algorithm
As the name suggest the migration
algorithm does the work of migration of
data elements. Instead of using a central
server serving each request, the block

7/12/2021
containing the data requested by a
system is migrated to it for further
access and processing. It migrates the
data on request.

This algorithm though is good if when a


system accesses the same block of data
multiple times and the ability to
integrate virtual memory concept, has
some shortcomings that are needed to be 27
addressed. Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Read replication algorithm


In the read replication algorithm, the
data block that is to be accessed
is replicated and only reading is
allowed in all the copies. If a write

7/12/2021
operation is to be done, then all read
access is put on halt till all the copies are
updated.

Overall system performance is improved


as concurrent access is allowed. But
write operation is expensive due to
the requirement of updating all blocks
that are shared to maintain concurrency.
All copies of data element are to be
28
tracked to maintain consistency.
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

full replication algorithm


An extension to read the replication
algorithm allowing the nodes to perform
both read and write operation on the
shared block of concurrently. But this

7/12/2021
access of nodes is controlled to maintain
its consistency.

To maintain consistency of data on


concurrent access of all nodes sequence is
maintained and after every modification
that is made in the data a
multicast with modifications is reflected
all the data copies.
29

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Models of distributed computation


The model of distributed computation can be described by using points that is
as follows:
1) Preliminaries
2) Causality

7/12/2021
3) Distributed Snapshots
4) Modelling a distributed computation
5) Failure in DS
6) Distributed mutual exclusion
7) Election
8) Distributed Deadlock handling
9) Distributed Termination detection
10) Detection of dynamic termination
- The STD algorithm
30
- The DTD algorithm
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Preliminaries & causality

7/12/2021
31

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Preliminaries & causality cont...

7/12/2021
32

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Distributed snapshots
A snapshot algorithm is used to create a
consistent snapshot of the global state of
a distributed system. Due to the lack of globally
shared memory and a global clock, this isn't trivially
possible.

7/12/2021
A distributed snapshot algorithm captures a
consistent global state of a distributed system. A
global state can be described by a cut that indicates
the time at which each process “checkpoints” its local
state and messages. In the case of a consistent cut C
(Fig), if a message crosses C, its “send” should be
before C and its “receive” should be after C.

When the system is recovered from a consistent cut,


every message will be sent exactly once. If a message’s
“send” is after C while its “receive” is before C, C
becomes inconsistent (Fig 2)’. It will cause problems
when the processes are restarted from an inconsistent
cut. For example, message m2 can be executed twice in
Fig. 33

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Distributed snapshots cont...

7/12/2021
34

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Failure in DS
DSM implements distributed systems shared memory model in an
exceedingly distributed system, that hasn’t any physically shared
memory.

7/12/2021
The shared model provides a virtual address space shared between any
numbers of nodes. The DSM system hides the remote communication
mechanism from the appliance author, protecting the programming ease
and quality typical of shared-memory systems.

35

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Failure in DS cont...
Failure recovery is an interesting problem in many applications, but especially in distributed systems, where
there may be multiple devices participating and multiple points of failure.

It’s very educational to identify the distinct roles in a system, and ask for each one, “What would happen
if that part of the system failed?”

7/12/2021
Network failures: participants are still running, but the connection between two or more is lost, or one or more
messages are dropped before reaching the recipient. Some systems may also have issues with unexpected delays
in message delivery.
Crash failures: a participant shuts down unexpectedly. This can occur as a result of application or environment
errors, or simply a loss of power.
Byzantine failures: a participant may act arbitrarily. This may be due to an adversary taking control of a
server, after which they may actively attempt to undermine the system. Byzantine failures remain an open
research area, and are often difficult to handle unless the system was explicitly designed with potentially
compromised participants in mind.
Simultaneous or repeated failures: these are somewhat meta-failures, in which multiple participants fail at
the same time, or a single participant experiences recurring failures.

Fault tolerant systems are those that are able to survive common failures and continue providing service even
while failures are occurring.

A lot of the work that results in failure recovery occurs at the design level of a system, so it’s useful to consider
the types of failures that may occur, the expected frequency and impact of failures, and design choices that may 36
reduce the risk of failures impacting our users.
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Distributed mutual exclusion


Mutual exclusion: Concurrent access of processes to a shared resource
or data is executed in mutually exclusive manner.

7/12/2021
Only one process is allowed to execute the critical section (CS) at any
given time.

In a distributed system, shared variables (semaphores) or a local


kernel cannot be used to implement mutual exclusion.

Message passing is the sole means for implementing distributed


mutual exclusion.
37

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Distributed mutual exclusion cont...


Requirements of Mutual Exclusion Algorithms:

1 Safety Property: At any instant, only one process can execute the

7/12/2021
critical section.

2 Liveness Property: This property states the absence of deadlock and


starvation. Two or more sites should not endlessly wait for messages
which will never arrive.

3 Fairness: Each process gets a fair chance to execute the CS. Fairness
property generally means the CS execution requests are executed in the
order of their arrival (time is determined by a logical clock) in the system.
38

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Solutions to Distributed mutual exclusion cont...


As we know shared variables or a local kernel can not be used to
implement mutual exclusion in distributed systems. Message passing
is a way to implement mutual exclusion. Below are the three
approaches based on message passing to implement mutual exclusion

7/12/2021
in distributed systems:

1. Token Based Algorithm:


⚫ A unique token is shared among all the sites.
⚫ If a site possesses the unique token, it is allowed to enter its critical section
⚫ This approach uses sequence number to order requests for the critical section.
⚫ Each requests for critical section contains a sequence number. This sequence
number is used to distinguish old and current requests.
⚫ This approach insures Mutual exclusion as the token is unique
39
Example: Suzuki-Kasami’s Broadcast Algorithm
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Solutions to Distributed mutual exclusion cont...


2. Non-token based approach:
⚫ A site communicates with other sites in order to determine which sites should
execute critical section next. This requires exchange of two or more successive
round of messages among sites.

7/12/2021
⚫ This approach use timestamps instead of sequence number to order requests
for the critical section.
⚫ When ever a site make request for critical section, it gets a timestamp.
Timestamp is also used to resolve any conflict between critical section
requests.
⚫ All algorithm which follows non-token based approach maintains a logical
clock. Logical clocks get updated according to Lamport’s scheme

Example: Lamport's algorithm, Ricart–Agrawala algorithm


40

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Solutions to Distributed mutual exclusion cont...


3. Quorum based approach:
⚫ Instead of requesting permission to execute the critical section from all other
sites, Each site requests only a subset of sites which is called a quorum.
⚫ Any two subsets of sites or Quorum contains a common site.

7/12/2021
⚫ This common site is responsible to ensure mutual exclusion

Example: Maekawa’s Algorithm

41

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Suzuki-kasami algorithm
Suzuki–Kasami algorithm is a token-based algorithm for achieving mutual exclusion in
distributed systems.This is modification of Ricart–Agrawala algorithm, a permission based
(Non-token based) algorithm which uses REQUEST and REPLY messages to ensure mutual
exclusion.

In token-based algorithms, A site is allowed to enter its critical section if it possesses the unique

7/12/2021
token. Non-token based algorithms uses timestamp to order requests for the critical section
where as sequence number is used in token based algorithms.
Each requests for critical section contains a sequence number. This sequence number is used to
distinguish old and current requests.

Data structure and Notations:

An array of integers RN[1…N]


A site Si keeps RNi[1…N], where RNi[j] is the largest sequence number received so far
through REQUEST message from site Si.
An array of integer LN[1…N]
This array is used by the token.LN[J] is the sequence number of the request that is recently
executed by site Sj.
A queue Q 42
This data structure is used by the token to keep record of ID of sites waiting for the token
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Suzuki-kasami algorithm
Algorithm:

To enter Critical section:


⚫ When a site Si wants to enter the critical section and it does not have the token then it
increments its sequence number RNi[i] and sends a request message REQUEST(i, sn) to
all other sites in order to request the token.

7/12/2021
Here sn is update value of RNi[i]
⚫ When a site Sj receives the request message REQUEST(i, sn) from site Si, it sets RNj[i] to
maximum of RNj[i] and sn i.e RNj[i] = max(RNj[i], sn).
⚫ After updating RNj[i], Site S j sends the token to site Si if it has token and RNj[i] = LN[i] + 1
To execute the critical section:
⚫ Site S executes the critical section if it has acquired the token.
i

To release the critical section:


After finishing the execution Site Si exits the critical section and does following:
⚫ sets LN[i] = RNi[i] to indicate that its critical section request RNi[i] has been executed
⚫ For every site S j, whose ID is not present in the token queue Q, it appends its ID
to Q if RNi[j] = LN[j] + 1 to indicate that site Sj has an outstanding request.
⚫ After above updation, if the Queue Q is non-empty, it pops a site ID from the Q and sends
the token to site indicated by popped ID.
⚫ If the queue Q is empty, it keeps the token
Ms.Rashi Jain
43

Downloaded by Jitesh Kumawat ([email protected])


lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Ricart agrawala algorithm


Ricart–Agrawala algorithm is an algorithm to for mutual exclusion in a distributed
system proposed by Glenn Ricart and Ashok Agrawala. This algorithm is an extension
and optimization of Lamport’s Distributed Mutual Exclusion Algorithm. Like Lamport’s
Algorithm, it also follows permission based approach to ensure mutual exclusion.
In this algorithm:

7/12/2021
Two type of messages ( REQUEST and REPLY) are used and communication channels
are assumed to follow FIFO order.
A site send a REQUEST message to all other site to get their permission to enter
critical section.
A site send a REPLY message to other site to give its permission to enter the critical
section.
A timestamp is given to each critical section request using Lamport’s logical clock.
Timestamp is used to determine priority of critical section requests. Smaller timestamp
gets high priority over larger timestamp. The execution of critical section request is
always in the order of their timestamp.
44

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Ricart agrawala algorithm


Algorithm:
To enter Critical section:
⚫ When a site Si wants to enter the critical section, it send a
timestamped REQUEST message to all other sites.

7/12/2021
When a site Sj receives a REQUEST message from site Si, It sends
a REPLY message to site Si if and only if
Site Sj is neither requesting nor currently executing the critical section.
In case Site Sj is requesting, the timestamp of Site Si‘s request is smaller than its own
request.Otherwise the request is deferred by site S j.
To execute the critical section:
⚫ Site Si enters the critical section if it has received the REPLY message from all
other sites.
To release the critical section:
⚫ Upon exiting site Si sends REPLY message to all the deferred requests.
45

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem


Maekawa’s algorithm
Maekawa’s Algorithm is quorum based approach to ensure mutual
exclusion in distributed systems. As we know, In permission based
algorithms like Lamport’s Algorithm, Ricart-Agrawala Algorithm etc. a
site request permission from every other site but in quorum based
approach, A site does not request permission from every other site

7/12/2021
but from a subset of sites which is called quorum.
In this algorithm:
Three type of messages ( REQUEST, REPLY and RELEASE) are used.
A site send a REQUEST message to all other site in its request set or
quorum to get their permission to enter critical section.
A site send a REPLY message to requesting site to give its permission to
enter the critical section.
A site send a RELEASE message to all other site in its request set or
quorum upon exiting the critical section. 46

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem


Maekawa’s algorithm
Algorithm:
To enter Critical section:
⚫ When a site S i wants to enter the critical section, it sends a request
message REQUEST(i) to all other sites in the request set R i.
⚫ When a site S j receives the request message REQUEST(i) from site Si , it returns

7/12/2021
a REPLY message to site Si if it has not sent a REPLY message to the site from the
time it received the last RELEASE message. Otherwise, it queues up the request..
To execute the critical section:
⚫ A site Si can enter the critical section if it has received the REPLY message from all the
site in request set R i
To release the critical section:
⚫ When a site S i exits the critical section, it sends RELEASE(i) message to all other sites
in request set R i
⚫ When a site S j receives the RELEASE(i) message from site Si , it send REPLY message
to the next site waiting in the queue and deletes that entry from the queue
⚫ In case queue is empty, site Sj update its status to show that it has not sent 47
any REPLY message since the receipt of theJain
Ms.Rashi last RELEASE message
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem


election
Distributed Algorithm is a algorithm that runs on a distributed system.
Distributed system is a collection of independent computers that do not
share their memory. Each processor has its own memory and they
communicate via communication networks.

7/12/2021
Communication in networks is implemented in a process on one machine
communicating with a process on other machine.

Many algorithms used in distributed system require a coordinator that


performs functions needed by other processes in the system. Election
algorithms are designed to choose a coordinator.

48

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Election algorithms
Election algorithms choose a process from group of processors to act as a
coordinator. If the coordinator process crashes due to some reasons, then
a new coordinator is elected on other processor. Election algorithm
basically determines where a new copy of coordinator should be restarted.

7/12/2021
Election algorithm assumes that every active process in the system has a
unique priority number. The process with highest priority will be chosen
as a new coordinator. Hence, when a coordinator fails, this algorithm
elects that active process which has highest priority number. Then this
number is send to every active process in the distributed system.

49

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Election algorithms cont...


We have two election algorithms for two different configurations of distributed system.
1. The Bully Algorithm –
This algorithm applies to system where every process can send a message to every other
process in the system.
Algorithm –

7/12/2021
Suppose process P sends a message to the coordinator.
If coordinator does not respond to it within a time interval T, then it is assumed that
coordinator has failed.
Now process P sends election message to every process with high priority number.
It waits for responses, if no one responds for time interval T then process P elects itself
as a coordinator.
Then it sends a message to all lower priority number processes that it is elected as
their new coordinator.
However, if an answer is received within time T from any other process Q,
⚫ (I) Process P again waits for time interval T’ to receive another message from Q
that it has been elected as coordinator.
⚫ (II) If Q doesn’t responds within time interval T’ then it is assumed to have failed
and algorithm is restarted. 50

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Election algorithms cont...

7/12/2021
51

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

The ring algorithm


This algorithm applies to systems organized as a ring(logically or physically). In
this algorithm we assume that the link between the process are unidirectional
and every process can message to the process on its right only. Data structure
that this algorithm uses is active list, a list that has priority number of all
active processes in the system.

7/12/2021
Algorithm –
If process P1 detects a coordinator failure, it creates new active list which is
empty initially. It sends election message to its neighbour on right and adds
number 1 to its active list.
If process P2 receives message elect from processes on left, it responds in 3
ways:
⚫ (I) If message received does not contain 1 in active list then P1 adds 2 to its active list
and forwards the message.
⚫ (II) If this is the first election message it has received or sent, P1 creates new active list
with numbers 1 and 2. It then sends election message 1 followed by 2.
⚫ (III) If Process P1 receives its own election message 1 then active list for P1 now
52
contains numbers of all the active processes in the system. Now Process P1 detects
highest priority number from list andMs.Rashi
elects it as the new coordinator.
Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

The ring algorithm

7/12/2021
53

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

7/12/2021
54

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

7/12/2021
55

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Distributed deadlock handling


In a distributed system deadlock can neither be prevented nor avoided as
the system is so vast that it is impossible to do so. Therefore, only
deadlock detection can be implemented. The techniques of deadlock
detection in the distributed system require the following:

7/12/2021
Progress –
The method should be able to detect all the deadlocks in the system.
Safety –
The method should not detect false or phantom deadlocks.

There are three approaches to detect deadlocks in distributed systems.


They are as follows:

56

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Distributed deadlock handling cont...


Centralized approach –
In the centralized approach, there is only one responsible resource to detect
deadlock. The advantage of this approach is that it is simple and easy to
implement, while the drawbacks include excessive workload at one node, single-
point failure (that is the whole system is dependent on one node if that node

7/12/2021
fails the whole system crashes) which in turns makes the system less reliable.

Distributed approach –
In the distributed approach different nodes work together to detect deadlocks.
No single point failure (that is the whole system is dependent on one node if
that node fails the whole system crashes) as the workload is equally divided
among all nodes. The speed of deadlock detection also increases.

Hierarchical approach –
This approach is the most advantageous. It is the combination of both
centralized and distributed approaches of deadlock detection in a distributed
system. In this approach, some selected nodes or clusters of nodes are 57

responsible for deadlock detection and these


Ms.Rashi Jainselected nodes are controlled by a
single node. Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Deadlock prevention: wait/wound/die algorithm


Assign each process a global timestamp when it starts.
No two processes should have same timestamp.

Basic idea: "When one process is about to block waiting for a resource that
another process is using, a check is made to see which has a larger timestamp (i.

7/12/2021
e. is younger)."

Somehow put timestamps on each process,


providing the creation time of each process.Suppose a process needs a
resource
already owned by another process.Determine relative ages of both processes.
Decide if waiting process should
⚫ Preempt,
⚫ Wait,
⚫ Die, or
⚫ Wound
the process that owns the resource.
58
The two different algorithms by Rosenkrantz (1978)
use these ideas, as explained below:
Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Deadlock prevention: wait/wound/die algorithm


The Wait-Die algorithm:
Allow wait only if waiting process is
older.
Since timestamps increase in any

7/12/2021
chain of waiting processes,
cycles are impossible.

The Wait-Die algorithm kills the


younger process. When the younger
process restarts and requests the
resource again, it may be killed
once more. This is the less efficient
of these two algorithms.

59

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Deadlock prevention: wait/wound/die algorithm


The Wound-Wait algorithm:
Otherwise allow wait only if waiting
process is younger.
Here timestamps decrease in any chain of
waiting process,

7/12/2021
so cycles are again impossible.
It is wiser to give older processes priority.

The Wound-Wait algorithm preempts the


younger process. When the younger
process re-requests resource, it has to wait
for older process to finish.
This is the better of the two algorithms.

Note: To avoid starvation. a process


should not be assigned a new timestamp
each time it restarts.

60

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Distributed termination detection


The basis of termination detection is in the concept of a distributed
system process' state. At any time, a process in a distributed system is
either in an active state or in an idle state.

7/12/2021
Termination occurs when all processes in the distributed
system become idle and there are no computational messages in transit.

Huang's algorithm is an algorithm for detecting termination in a


distributed system. The algorithm was proposed by Shing-Tsaan Huang
in 1989 in the Journal of Computers.

61

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Huang’s algorithm for termination detection


In a distributed system, a process is either in an active state or in an idle state at any
given point of time. Termination occurs when all of the processes becomes idle and
there are no any in transit(on its way to be delivered) computational message.
Huang's algorithm can be described by the following:
Initially all processes are idle.

7/12/2021
A distributed task is started by a process sending a computational message to another
process. This initial process to send the message is the "controlling agent".
⚫ The initial weight of the controlling agent is {\displaystyle w} (usually 1).
The following rules are applied throughout the computation:
⚫ A process sending a message splits its current weight between itself and the message.
⚫ A process receiving a message adds the weight of the message to itself.
⚫ Upon becoming idle, a process sends a message containing its entire weight back to the
controlling agent and it goes idle.
⚫ Termination occurs when the controlling agent has a weight of {\displaystyle w} and is in the
idle state.
Some weaknesses to Huang's algorithm are that it is unable to detect termination if a
message is lost in transit or if a process fails while in an active state.
62

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

Detection of dynamic termination


It is composed into two parts:
1) STD algorithm (Static)
2) DTD algorithm (Dynamic)

Termination detection in distributed systems has been a popular problem

7/12/2021
of study. A large number of termination detection algorithms have been
proposed for static distributed systems in which the number of nodes present
in the system is fixed and never changes during runtime.

There is relatively less work on dynamic systems, where processes may be


created as well as destroyed while the computation is in progress.

The task of termination detection in dynamic systems is more difficult because


the exact number of processes participating in the computation is not known at
any instant of time. Also, since processes may destroy themselves, the
algorithm has to ensure that (i) the computation does not get partitioned, and
(ii) a process capable of detecting termination always exists in the system. As a63
result, termination detection algorithms for dynamic systems are more complex
than those for static systems. Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])
lOMoARcPSD|55959726

ARYA GROUP OF COLLEGES JAIPUR Distributed System CS VI Sem

7/12/2021
Thank You .....

64

Ms.Rashi Jain
Downloaded by Jitesh Kumawat ([email protected])

You might also like